Preface





How this book came about 




This book is the outcome of a long cherished ambition to write a follow-up to my book 
Theoretical Concepts in Physics (TCP2) (Longair, 2003). In that book, I took the story of 
the development of theoretical concepts in physics up to the discovery of quanta and the 
acceptance by the physics community that quanta and quantisation are essential features of 
the new physics of the early twentieth century. There was neither space nor scope to take 
that story further — it was just too complicated and would have required more advanced 
mathematics than I wished to include in that volume. 

This book is my attempt to do for quantum mechanics what I did for classical physics 
and relativity in TCP2. The objective is to try to reconstruct as closely as possible the 
way in which quantum mechanics was created out of a mass of diverse experimental data 
and mathematical analyses through the period from about 1900 to 1930. In my view, 
quantisation and quanta are the greatest discoveries in the physics of the twentieth century. 
The phenomena of quantum mechanics have no direct impact upon our consciousness 
which to all intents and purposes is a world dominated by classical physics. But quantum 
mechanics underlies all the phenomena of matter and radiation and is the basis of essentially 
all aspects of civilisation in the twenty-first century. 

There is no lack of excellent books on quantum mechanics, which is one of the staples of all
courses in undergraduate physics. Most of the successful texts adopt an axiomatic approach 
in which quantum mechanics is derived from a set of basic axioms, the consequences 
of which are elucidated in the subsequent mathematical elaboration. The first complete 
exposition of this approach was Dirac’s classic book Principles of Quantum Mechanics of 
1930, which may be thought of as the ultimate goal of this book (Dirac, 1930a). But how
did it all come about? Can we understand why the theory has to be as complex as it is, and
how the interpretation of the formalism came about?

Just as the core of TCP2 was inspired by the essays of Martin J. Klein (1967), so this 
book was inspired long ago by the book Sources of Quantum Mechanics edited by B. L. van 
der Waerden (1967). I had an ambition to use van der Waerden’s book as the basis of the 
equivalent of TCP2 for the development of quantum mechanics. This was reinforced by 
the appearance of the massive six-volume series The Historical Development of Quantum 
Theory by Jagdish Mehra and Helmut Rechenberg, published between 1982 and 2001, which
provides a very thorough, authoritative survey of the history of quantum mechanics
(Mehra and Rechenberg, 1982a,b,c,d, 1987, 2000, 2001). Equally inspiring
was The Conceptual Development of Quantum Mechanics by Max Jammer which covers
similar ground in a single volume (Jammer, 1989). Another inspiration was the book Inward 
Bound by Abraham Pais (1985) which sets the development of quantum mechanics and 
quantum phenomena in a much longer time-frame. In my view, these truly excellent books 
are quite hard work and can only be readily appreciated by those who already have a strong 
foundation in classical and quantum physics. They are quite a challenge for those seeking 
more readily accessible enlightenment. 


The historical approach and level of presentation 


The experience of teaching and writing a number of books convinced me of the value of 
rethinking the foundations of physics from a somewhat historical perspective, at the same 
time making as few assumptions as is reasonable about the mathematical sophistication of 
the reader. As in TCP2, I assume some fluency in physics and mathematics, but nothing 
that would be beyond the first couple of years of the typical course in physics. It is useful 
to restate some of the objectives of TCP2 which apply equally to the approach adopted in 
this book, in contrast to the standard way in which the subject is tackled. 

The origin of TCP2 can be traced to discussions in the Cavendish Laboratory in the mid- 
1970s among those of us who were involved in teaching theoretically biased undergraduate 
courses. There was a feeling that the syllabuses lacked coherence from the theoretical 
perspective and that the students were not quite clear about the scope of physics as opposed 
to theoretical physics. As our ideas evolved, it became apparent that a discussion of these 
ideas would be of value for all final-year students. The course entitled Theoretical Concepts 
in Physics was therefore designed to be given in the summer term in July and August to 
undergraduates entering their final year. It was to be strictly non-examinable and entirely 
optional. Students obtained no credit from having attended the course beyond an increased 
appreciation of physics and theoretical physics. I was invited to give this course of lectures 
for the first time, with the considerable challenge of attracting students to 9.00 am lectures on 
Mondays, Wednesdays and Fridays during the most glorious summer months in Cambridge. 

The course was designed to contain the following elements: 


(a) The interaction between experiment and theory. Particular stress would be laid upon 
the importance of experiment and, in particular, the role of advanced technology in 
leading to theoretical insights. 

(b) The importance of having available the appropriate mathematical tools for tackling 
theoretical problems. 

(c) The theoretical background to the basic concepts of modern physics, emphasising 
underlying themes such as symmetry, conservation, invariance, and so on. 

(d) The role of approximations and models in physics. 

(e) The analysis of real scientific papers in theoretical physics, providing insight into how 
professional physicists tackle real problems. 
(f) The consolidation and revision of many of the basic physical concepts which all
final-year undergraduates can reasonably be expected to have at their fingertips.

(g) Finally, to convey my own personal enthusiasm for physics and theoretical physics. 
My own research has been in high energy astrophysics and astrophysical cosmology, 
but I remain a physicist at heart. My own view is that astronomy, astrophysics and 
cosmology are no more than subsets of physics, but applied to the Universe on the large 
scale. I am one of the very lucky generation who began research in astrophysics in 
the early 1960s and who have witnessed the astonishing revolutions which have taken 
place in our understanding of all aspects of the physics of the Universe. But, the same 
can be said of all areas of physics. The subject is not a dead, pedagogic discipline, the 
only object of which is to provide examination questions for students. It is an active, 
extensive subject in a robust state of good health. 


My objective in writing Quantum Concepts in Physics has been to adopt the same
user-friendly approach as in TCP2 but now applied to the discovery of quantum mechanics.
I should emphasise that this is a personal approach to the understanding of quantum 
mechanics, but it has the great virtue of forcing the writer and reader to think hard about the 
issues at stake at each stage in the development through one of the most dramatic periods 
in the evolution of our understanding of fundamental processes in physics. One of the 
differences as compared with TCP2 is that somewhat more advanced mathematical tools 
have to be introduced to appreciate the full essence of the story. I have tried to lay out the 
necessary mathematics in as simple a form as I could devise, without sacrificing rigour. In 
my view, final-year undergraduates and their teachers should have little trouble in coping 
with these requirements. 

Let me also emphasise that this book is not a textbook on quantum mechanics. It is 
certainly not a substitute for the systematic development of these topics through the standard
axiomatic approach to the discipline. You should regard this book as a supplement to the 
standard courses, but one which I hope will enhance your understanding, appreciation and 
enjoyment of the physics. Certainly, I have learned a huge amount about quantum mechanics 
through studying the works of genius of the pioneers of the subject. 


The challenge 


Let me make it clear at the outset that the amount of material which has to be condensed 
into a single manageable volume is immense. Some impression of the magnitude of the 
task can be appreciated from the almost 4500 pages of the magnificent series by Mehra and 
Rechenberg. In addition, the history of physics literature is vast. As a result, I have had to 
be selective, and although the course is tortuous, I have had to streamline the story to reach 
my goal in a finite space. For further enlightenment, which I thoroughly recommend, there 
is no alternative but to delve into the writings of Mehra, Rechenberg, Jammer, Pais and the 
many other authors cited in the text. 
I should also confess that, although I have taught numerous courses on quantum physics,
I do not regard myself as a ‘black-belt’ quantum physicist. This has the advantage that I 
am embarking on a voyage of personal intellectual discovery as well. I like very much the 
splendid remark of Fitzgerald, 


‘A Briton wants emotion in his science, something to raise enthusiasm, something with 
human interest.’ (Fitzgerald, 1902) 


I confess to belonging to that school. I hope you will enjoy this adventure as much as I do. 


Malcolm Longair 
Cambridge and Venice, 2012 
Physics and theoretical physics in 1895





1.1 The triumph of nineteenth century physics 


The nineteenth century was an era of unprecedented advance in the understanding of the laws 
of physics. In mechanics and dynamics, more and more powerful mathematical tools had 
been developed to enable complex dynamical problems to be solved. In thermodynamics, 
the first and second laws were firmly established, through the efforts of Rudolf Clausius 
and William Thomson (Lord Kelvin), and the full ramifications of the concept of entropy 
for classical thermodynamics were being elaborated. James Clerk Maxwell had derived 
the equations of electromagnetism which were convincingly validated by Heinrich Hertz’s 
experiments of 1887 to 1889. Light and electromagnetic waves were the same thing, thus 
providing a firm theoretical foundation for the wave theory of light which could account 
for virtually all the known phenomena of optics. 

Sometimes the impression is given that experimental and theoretical physicists of the 
1890s believed that the combination of thermodynamics, electromagnetism and classical 
mechanics could account for all known physical phenomena and that all that remained 
was to work out the consequences of these recently won achievements. As remarked 
by Brian Pippard in his survey of physics in 1900, Albert Michelson’s famous remark
that 


‘Our future discoveries must be looked for in the sixth place of decimals.’ (Michelson, 
1903) 


has often been quoted out of context and is better viewed in the light of Maxwell’s words 
in his inaugural lecture as the first Cavendish Professor of Experimental Physics in 1871: 


‘I might bring forward instances gathered from every branch of science, showing how 
the labour of careful measurement has been rewarded by the discovery of new fields of 
research, and by the development of new scientific ideas.’ (Maxwell, 1890) 


Maxwell’s prescient words were the battle-cry for the extraordinary events which were 
to take place over the succeeding decades. In fact, the late nineteenth century was a period 
of ferment in the physical sciences when many awkward fundamental problems remained 
to be solved. These exercised the minds of the greatest physicists of the period. Ultimately, 
the resolution of these problems was to revolutionise the foundations of physics with the 
discovery of the wholly different world of quantum mechanics. Let us begin by reviewing 
some of the issues which led to the crisis of early twentieth century physics.
1.2 Atoms and molecules in the nineteenth century


The origin of the modern concept of atoms and molecules can be traced to the understanding
of the laws of chemistry in the early years of the nineteenth century. In the late eighteenth
century Antoine-Laurent de Lavoisier established the law of conservation of mass in chemical
reactions. Then, in the period between 1798 and 1804, Joseph Louis Proust established
his law of definite proportions, according to which: 


‘A given chemical compound contains the same elements united in the same fixed
proportions by mass.’


For example, oxygen makes up 8/9 of the mass of any sample of pure water, while hydrogen 
makes up the remaining 1/9. In 1803, John Dalton followed up this law with his law of 
multiple proportions according to which: 


‘When two elements combine together to form more than one compound, the weights
of one element which unites with a given weight of the other are in simple multiple 
proportion.’ 


Next, in 1808 Joseph-Louis Gay-Lussac published his law of combining volumes of gases,
which states: 


‘The volumes of gases taking part in a chemical change either as reagents or as products,
bear a simple numerical relation to one another if all measurements are made under the 
same conditions of temperature and pressure.’ 


For example, two volumes of hydrogen react with one volume of oxygen in forming two 
volumes of water vapour. 
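The water example quoted above for the law of definite proportions, and the law of multiple proportions, can be checked with a few lines of arithmetic. The sketch below is illustrative only: the atomic weights are modern values assumed for the purpose, and the choice of carbon monoxide and carbon dioxide as the pair of compounds is an assumption, not an example taken from the text.

```python
# Atomic weights on the modern scale (assumed values, not from the text).
ATOMIC_WEIGHT = {"H": 1.008, "O": 15.999, "C": 12.011}


def mass_fractions(formula):
    """Return the mass fraction of each element in a compound.

    `formula` maps element symbols to the number of atoms per molecule,
    e.g. {"H": 2, "O": 1} for water.
    """
    total = sum(ATOMIC_WEIGHT[el] * n for el, n in formula.items())
    return {el: ATOMIC_WEIGHT[el] * n / total for el, n in formula.items()}


# Definite proportions: oxygen should make up close to 8/9 of water by mass.
water = mass_fractions({"H": 2, "O": 1})
print(f"oxygen fraction of water: {water['O']:.4f} (8/9 = {8 / 9:.4f})")


def oxygen_per_unit_carbon(n_oxygen):
    """Mass of oxygen combining with unit mass of carbon in CO_n."""
    return n_oxygen * ATOMIC_WEIGHT["O"] / ATOMIC_WEIGHT["C"]


# Multiple proportions: CO2 relative to CO should give a simple whole number.
ratio = oxygen_per_unit_carbon(2) / oxygen_per_unit_carbon(1)
print(f"oxygen per unit carbon, CO2 relative to CO: {ratio:.1f}")
```

The small difference between the computed oxygen fraction and 8/9 reflects the fact that the atomic weight of hydrogen is not exactly 1 on the modern scale.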

These concepts were synthesised and taken much further by Dalton in his influential 
treatise A New System of Chemical Philosophy (Dalton, 1808). He asserted that the ultimate 
particles, or atoms, of a chemically homogeneous substance all had the same weight and 
shape and drew up a table of the relative weights of the atoms of a number of simple 
substances (Fig. 1.1). According to his hypothesis, the atoms are particles of matter which 
cannot be subdivided into more primitive forms by chemical processes. Originally, Dalton 
and Berzelius considered that equal volumes of gases under identical physical conditions 
contain the same number of atoms, but this concept did not agree with the observed 
relations of the volumes of different combining gases. The solution was provided by 
Amedeo Avogadro who in 1811 realised that the physical unit was not a single atom but
a cluster of a small number of atoms, which he defined as molecules, namely the smallest
particle of the gas which moves about as a whole. Avogadro’s hypothesis then states that:


‘Equal volumes of all gases under the same conditions of temperature and pressure contain
the same numbers of molecules.’ 


A central role in what follows concerned the molecular, or atomic, weight of a substance. 
This is defined to be the weight of the particle on a scale in which the oxygen atom has 
weight 16 units. Correspondingly, the gram-molecular weight was defined as the weight 
of the particle on a scale in which the weight of oxygen was 16 gram. Consequently,
the gram-molecular weights of all substances contain the same numbers of molecules.
Avogadro’s hypothesis was disregarded by chemists for almost 50 years until 1858 when
Stanislao Cannizzaro convinced the leading chemists of the truth of the hypothesis.

Fig. 1.1 Dalton’s symbols for the atoms of various elements and their compounds (Dalton, 1808).

A major preoccupation was putting some order into the properties of the chemical 
elements. In 1789, Lavoisier published a list of 33 chemical elements and grouped them 
into gases, metals, non-metals and earths. The search was on for a more precise classification 
scheme. It was known that certain groups of elements had similar chemical properties — 
for example, the alkali metals sodium, potassium and rubidium and the halogens, chlorine, 
bromine and iodine. Dmitri Mendeleyev in 1869 and Julius Meyer in 1870 independently 
published what became known as the periodic table of the elements. The tables were 
constructed by listing the elements in order of increasing atomic weight and then starting a 
new column when similar classes of elements appeared (Fig. 1.2). Mendeleyev left gaps in 
the table if an appropriate element had not yet been discovered and then used the trends to 
predict the properties of the missing elements. Examples included scandium, gallium and 
germanium. On occasion, he altered the order by atomic weight in order to match similar 
elements in different rows. 
Fig. 1.2 Mendeleyev’s original version of the periodic table of 1869 (Mendeleyev, 1869). The question marks indicate unknown
elements inserted so that similar elements would lie along the same row.


The process of filling in the elements in the periodic table continued throughout the 
nineteenth century. The understanding of the physics of the atoms of different elements 
was to be a major concern of Niels Bohr as he struggled to incorporate them into the old 
quantum theory in the early 1920s. The chemists did not really need the atomic hypothesis 
to make progress, but rather the empirical rules described above were sufficient to enable 
remarkable progress to be made in the understanding of chemical processes. As the interest 
of the chemists waned, however, the physicists took up the reins with the need to provide a
microscopic interpretation of the laws of thermodynamics. 


1.3 The kinetic theory of gases and Boltzmann’s statistical mechanics


The discovery of the first and second laws of thermodynamics in the early 1850s placed 
thermodynamics on a firm theoretical foundation and these laws were to be elucidated 
by the next generation of theoretical physicists. Among the challenges was the physical
interpretation of the law of increase of entropy which exercised the greatest minds of the
subsequent period, including Clausius, Maxwell, Boltzmann and Planck. 


1.3.1 The kinetic theory of gases 


The laws of thermodynamics describe the properties of matter in bulk. In fact, the theory 
denies that there is any microscopic structure, its great merit being that it provides general 
relations between the macroscopic properties of material systems. Nonetheless, Clausius 
and Maxwell had no hesitation in developing the kinetic theory of gases, considering them 
to consist of vast numbers of particles making continuous elastic collisions with each other 
and the walls of the containing vessel. Clausius provided the first systematic account of 
the theory in 1857 in his paper entitled On the nature of the motion, which we call heat 
(Clausius, 1857). He succeeded in deriving the equation of state of a monatomic gas by 
working in terms of the mean velocities of the particles. Whilst accounting for the perfect 
gas law admirably, it did not give good agreement with the known values of the ratios of their
specific heat capacities, $\gamma = C_p/C_V$ for molecular gases, where $C_p$ and $C_V$ are the specific
heat capacities at constant pressure and constant volume respectively. From experiment, $\gamma$
was found to be 1.4 for molecular gases, whereas the kinetic theory predicted $\gamma = 1.67$.
In the last sentence of his paper, Clausius recognised the important point that there must 
therefore exist other means of storing kinetic energy within molecular gases which can 
increase their internal energy per molecule. 

One feature of Clausius’s work was of particular significance for Maxwell. From the 
kinetic theory, Clausius worked out the mean velocities of air molecules from his formula 
$\tfrac{3}{2}RT = \tfrac{1}{2}NMu^2$. For oxygen and nitrogen, he deduced velocities of 461 and 492 m s$^{-1}$
respectively. The Dutch meteorologist Christoph Buys Ballot criticised this aspect of the 
theory, since it is well known that pungent odours take minutes to permeate a room. 
Clausius’s response was that the air molecules collide with each other and therefore diffuse 
from one part of a volume to another, rather than propagate in straight lines. In his paper, 
Clausius introduced the concept of the mean free path of the atoms and molecules of gases 
for the first time (Clausius, 1858). Thus, in the kinetic theory of gases, it must be supposed 
that there are continual collisions between the molecules.
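The molecular velocities quoted above for oxygen and nitrogen can be reproduced approximately with a short calculation. The sketch assumes that Clausius's formula has the standard kinetic-theory form $\tfrac{3}{2}RT = \tfrac{1}{2}NMu^2$, with $NM$ the molar mass, and that the gas is at the temperature of melting ice; the temperature and the modern molar masses are assumptions, since the text does not state them.

```python
import math

R = 8.314462618   # molar gas constant, J mol^-1 K^-1
T_ICE = 273.15    # assumed temperature (melting ice), K


def rms_speed(molar_mass_kg, temperature=T_ICE):
    """Speed u from (3/2) R T = (1/2) (N M) u^2, where N M is the molar mass."""
    return math.sqrt(3.0 * R * temperature / molar_mass_kg)


u_oxygen = rms_speed(31.998e-3)    # O2, molar mass in kg per mole
u_nitrogen = rms_speed(28.014e-3)  # N2, molar mass in kg per mole
print(f"O2: {u_oxygen:.0f} m/s, N2: {u_nitrogen:.0f} m/s")
```

Under these assumptions the results lie within a metre or two per second of the values of 461 and 492 m s$^{-1}$ attributed to Clausius.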

Both papers by Clausius were known to Maxwell when he turned to the problem of the 
kinetic theory of gases in 1859 and 1860. His work was published in 1860 in a characteristically
novel and profound series of papers entitled Illustrations of the dynamical theory
of gases (Maxwell, 1860a,b,c). In a few brief paragraphs, he derived the formula for the
velocity distribution $f(u)$ of the particles of the gas and introduced statistical concepts into
the kinetic theory of gases and thermodynamics,


$$f(u)\,\mathrm{d}u = 4\pi \left(\frac{m}{2\pi kT}\right)^{3/2} u^2 \exp\left(-\frac{mu^2}{2kT}\right) \mathrm{d}u. \qquad (1.1)$$

Maxwell immediately noted 


‘that the velocities are distributed among the particles according to the same law as the 
errors are distributed among the observations in the theory of the method of least squares.’ 
Francis Everitt has written that this derivation of Maxwell’s velocity distribution marks the
beginning of a new epoch in physics (Everitt, 1975). The statistical nature of the laws of 
thermodynamics and the modern theory of statistical mechanics follow directly from his 
analysis. 
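As a check on the form of the velocity distribution (1.1), the following sketch integrates it numerically and confirms that it is normalised to unity and that the mean-square speed equals $3kT/m$. The choice of nitrogen at 300 K and the simple trapezoidal integrator are assumptions made purely for illustration.

```python
import math

K_B = 1.380649e-23                  # Boltzmann's constant, J/K
M_N2 = 28.014e-3 / 6.02214076e23    # mass of one N2 molecule, kg (assumed example gas)
T = 300.0                           # assumed temperature, K


def maxwell_f(u, m=M_N2, temperature=T):
    """Maxwell speed distribution f(u), in the form of equation (1.1)."""
    a = m / (2.0 * K_B * temperature)
    return 4.0 * math.pi * (a / math.pi) ** 1.5 * u * u * math.exp(-a * u * u)


def integrate(func, lower, upper, steps=20000):
    """Composite trapezoidal rule on a uniform grid."""
    h = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for i in range(1, steps):
        total += func(lower + i * h)
    return total * h


U_MAX = 5000.0  # m/s, far into the exponential tail for N2 at 300 K
norm = integrate(maxwell_f, 0.0, U_MAX)
mean_square = integrate(lambda u: u * u * maxwell_f(u), 0.0, U_MAX)
expected_mean_square = 3.0 * K_B * T / M_N2

print(f"normalisation: {norm:.6f}")
print(f"mean-square speed ratio: {mean_square / expected_mean_square:.6f}")
```

The mean-square speed $3kT/m$ is the same quantity that appears in Clausius's mean-velocity formula, so the two treatments agree on the energy per molecule.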

Maxwell, however, ran up against exactly the same problem as Clausius. If only the 
translational degrees of freedom are taken into account, the value of $\gamma$ should be 1.67.
Maxwell also considered the case in which the rotational degrees of freedom of non-spherical
molecules were taken into account as well as their translational motions, but this
calculation resulted in a ratio of specific heat capacities $\gamma = 1.33$, again inconsistent with
the value 1.4 observed in the common molecular gases. In the last sentence of his great 
paper, he makes the discouraging remark: 


‘Finally, by establishing a necessary relation between the motions of translation and 
rotation of all particles not spherical, we proved that a system of such particles could not 
possibly satisfy the known relation between the two specific heats of all gases.’ 


1.3.2 The viscosity of gases 


Despite this difficulty, Maxwell immediately applied the kinetic theory of gases to their 
transport properties — diffusion, thermal conductivity and viscosity. His calculation of the 
coefficient of viscosity of gases was of special importance. Specifically, he worked out
how the coefficient of dynamic or absolute viscosity $\eta$ is expected to change with pressure
and temperature. He found the result


$$\eta = \tfrac{1}{3}\lambda \bar{u} n m = \tfrac{1}{3}\frac{m\bar{u}}{\sigma}, \qquad (1.2)$$


where $\lambda$ is the mean free path of the molecules, $\bar{u}$ is their mean velocity, $n$ their number
density and $\sigma$ the collision cross-section of the molecules; $\lambda$ and $\sigma$ are related by $\lambda = 1/n\sigma$.
Maxwell was surprised to find that the coefficient of viscosity is independent of the pressure, 
since there is no dependence upon number density n in (1.2). The reason is that, although 
there are fewer molecules per unit volume as n decreases, the mean free path increases as 
$n^{-1}$, enabling the increment of momentum transfer to take place over greater distances.
Furthermore, as the temperature of the gas increases, $\bar{u}$ increases as $T^{1/2}$. Therefore, the
viscosity of a gas should increase with temperature, unlike the behaviour of liquids. This 
somewhat counter-intuitive result was the subject of a brilliant set of experiments carried 
out by Maxwell from 1863 to 1865 (Fig. 1.3). He confirmed the prediction of the kinetic 
theory that the viscosity of gases is independent of the pressure. He expected to discover 
the $T^{1/2}$ law as well, but in fact found a stronger dependence, $\eta \propto T$.
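The two predictions discussed above, independence of pressure and a square-root dependence on temperature, can be illustrated numerically. The sketch assumes the elementary kinetic-theory estimate $\eta = \tfrac{1}{3}\lambda\bar{u}nm$ with $\lambda = 1/n\sigma$ and the mean speed of the Maxwell distribution, $\bar{u} = (8kT/\pi m)^{1/2}$. The prefactor, the molecular mass and the cross-section are illustrative assumptions rather than values taken from Maxwell's papers.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant, J/K


def mean_speed(m, temperature):
    """Mean speed of the Maxwell distribution, sqrt(8 k T / (pi m))."""
    return math.sqrt(8.0 * K_B * temperature / (math.pi * m))


def viscosity(n, m, sigma, temperature):
    """Elementary hard-sphere estimate eta = (1/3) * lambda * u_mean * n * m."""
    mean_free_path = 1.0 / (n * sigma)
    return mean_free_path * mean_speed(m, temperature) * n * m / 3.0


M = 4.65e-26      # kg, roughly one N2 molecule (illustrative)
SIGMA = 4.0e-19   # m^2, illustrative collision cross-section

# Varying the number density by a factor of 100 leaves eta unchanged.
for n in (1e23, 1e24, 1e25):
    print(f"n = {n:.0e} m^-3  eta = {viscosity(n, M, SIGMA, 300.0):.3e} Pa s")

# Quadrupling the temperature doubles eta in this hard-sphere model.
print(viscosity(1e25, M, SIGMA, 1200.0) / viscosity(1e25, M, SIGMA, 300.0))
```

This hard-sphere sketch reproduces only the $T^{1/2}$ prediction. The stronger dependence found in Maxwell's experiments is not captured by it.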

Fig. 1.3 (a) Maxwell’s apparatus for measuring the viscosity of gases. The gas fills the chamber and the glass discs oscillate as a
torsion pendulum. The viscosity of the gas is found by measuring the rate of decay of the oscillations of the torsion
balance. The oscillations of the torsion balance were measured by reflecting a light beam from the mirror attached to
the suspension. The pressure and temperature of the gas could be varied. The oscillations were started magnetically
since the volume of the chamber had to be perfectly sealed. (b) Maxwell’s apparatus on display in the Cavendish
Laboratory.

In his great paper of 1867, he interpreted this result as indicating that there must be
a repulsive force between the molecules which varied with distance $r$ as $r^{-5}$. This was
a profound discovery since it meant that there was no longer any need to consider the
molecules to be ‘elastic spheres of definite radius’ (Maxwell, 1867). The repulsive force,
proportional to $r^{-5}$, meant that encounters between molecules would take the form of
deflections through different angles, depending upon the impact parameter. Maxwell showed 
that it was more appropriate to think in terms of a relaxation time, roughly the time it would 
take a molecule to be deflected through 90°, as a result of random encounters with other 
molecules. According to Maxwell’s analysis, molecules could be replaced by centres of 
repulsion, or, in his words, ‘mere points, or pure centres of force endowed with inertia’ — it 
was no longer necessary to make any special assumption about molecules as hard, elastic 
spheres. To express this insight more provocatively, the concept of collisions between 
particles was replaced by interactions between fields of force. 




1.3.3 The kinetic theory of gases and the law of increase of entropy 


Another problem concerned the origins of the spectral lines observed in atomic and molecular
spectra. If these were associated with internal resonances within molecules, then
presumably these provided further means by which energy could be stored in the gas according
to the principle of the equipartition of energy, which awards an average of $\frac{1}{2}kT$ of
energy to each degree of freedom in equilibrium. Consequently, the number of degrees of
freedom $N$ per molecule would increase and the ratio of specific heat capacities,


$$\gamma = \frac{C_p}{C_V} = \frac{\frac{1}{2}NkT + kT}{\frac{1}{2}NkT} = \frac{N+2}{N}, \qquad (1.3)$$


would tend to unity. The fact that the kinetic theory, and specifically the equipartition 
theorem, could not satisfactorily account for all the properties of gases was a major barrier 
to the acceptance of the kinetic theory. Furthermore, there was no direct experimental 
evidence for the existence of atoms and molecules. 
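The dependence of the ratio of specific heat capacities on the number of degrees of freedom in (1.3), $\gamma = (N+2)/N$, is easily tabulated. The short sketch below evaluates it for a few values of $N$; the particular values chosen are illustrative assumptions.

```python
from fractions import Fraction


def gamma(degrees_of_freedom):
    """Ratio of specific heat capacities, gamma = (N + 2) / N, as in (1.3)."""
    return Fraction(degrees_of_freedom + 2, degrees_of_freedom)


for n in (3, 5, 6, 20, 200):
    print(f"N = {n:3d}  gamma = {float(gamma(n)):.3f}")
```

Three degrees of freedom give 5/3 (about 1.67) and six give 4/3 (about 1.33), the two values obtained by Clausius and Maxwell. The observed value of 1.4 corresponds numerically to $N = 5$, and $\gamma$ approaches unity as $N$ becomes large.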

The status of atomic and molecular theories of the structure of matter came under attack 
from a small number of prominent physicists, including Ernst Mach, Wilhelm Ostwald, 
Pierre Duhem and Georg Helm who rejected the approach of interpreting the laws of 
macroscopic physics at the microscopic level. Their approach was based on the concept of 
‘energetics’, in which only energy considerations were invoked in understanding physical 
phenomena, in clear conflict with those who favoured atomic and molecular theories. Most 
late nineteenth century physicists were, however, of the view that, although the details were 
not quite right, the atomic and molecular hypothesis was indeed the way ahead. 

In 1867, Maxwell first presented his famous argument by which he demonstrated how it 
is possible to transfer heat from a colder to a hotter body on the basis of the kinetic theory of 
gases, in violation of the strict application of the law of increase of entropy (Maxwell, 1867). 
This argument is commonly referred to as involving ‘Maxwell’s demon’. The Maxwell
velocity distribution describes the range of velocities which inevitably must be present in a 
gas in thermal equilibrium at temperature T. He considered a vessel divided into two halves, 
A and B, the gas in A being hotter than that in B with a small hole drilled in the partition 
between them. Whenever a fast molecule moves from B to A, heat is transferred from the 
colder to the hotter body without the influence of any external agency. It is overwhelmingly 
more likely that hot molecules move from A to B and in this process heat flows from 
the hotter to the colder body with the consequence that the entropy of the whole system 
increases. According to the kinetic theory of gases, however, there is a very small but finite 
probability that the reverse will happen spontaneously and entropy will decrease in this 
natural process. 

In the late 1860s, Clausius and Boltzmann attempted to derive the second law of thermodynamics
from mechanics, an approach known as the dynamical interpretation of the
second law. The dynamics of individual particles were followed in the hope that they would 
ultimately lead to an understanding of the origin of the second law. Maxwell rejected this 
approach as a matter of principle because of the simple but compelling argument that 
Newton’s laws of motion and Maxwell’s equations for the electromagnetic field are time
reversible and consequently the irreversibility implicit in the second law cannot be explained
by a dynamical theory. The second law could only be understood as a statement
about the statistical behaviour of an immense number of particles. 

Eventually, Boltzmann accepted Maxwell’s doctrine concerning the statistical nature of 
the second law and set about working out the formal relationship between entropy and 
probability, 


$$S = k \ln p, \qquad (1.4)$$


where S is the entropy, p the probability of that state and k is a universal constant. In 
his analysis, the value of the constant k, Boltzmann’s constant, was not known. Boltzmann’s
analysis is of considerable mathematical complexity and, indeed, this was one of
the problems which stood in the way of the scientists of his day fully appreciating the deep 
significance of what he had achieved. Among those who did was Josiah Willard Gibbs 
who, in his fundamental text Elementary Principles of Statistical Mechanics, demonstrated 
that systems of huge numbers of particles did indeed tend to thermodynamic equilibrium, 
although there were necessarily fluctuations about the average properties of the gas at every 
stage of the evolution towards equilibrium (Gibbs, 1902). 
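A simple worked example may help to show what the relation (1.4) implies for fluctuations away from equilibrium. It is a sketch under stated assumptions: a perfect gas of $N$ independent molecules in a vessel of two equal halves, with entropy differences taken to follow from the ratio of the probabilities of the two states. The labels `half` and `full` are introduced here for illustration only.

```latex
\begin{align}
  S_2 - S_1 &= k \ln\frac{p_2}{p_1}, \\
  \frac{p_{\text{half}}}{p_{\text{full}}} &= \left(\tfrac{1}{2}\right)^{N}
  \quad\Longrightarrow\quad
  S_{\text{half}} - S_{\text{full}} = -N k \ln 2 .
\end{align}
```

For a gram-molecular weight of gas the decrease is about 5.8 J K$^{-1}$, and the corresponding probability $2^{-N}$ is so small that the fluctuation is never observed in practice, although it is not strictly forbidden.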


1.4 Maxwell’s equations for the electromagnetic field


One of the unquestioned triumphs of nineteenth century physics was Maxwell’s discovery 
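of the equations for the electromagnetic field. Building on the brilliant experimental
investigations of Michael Faraday, Maxwell discovered the equations for the electromagnetic
field:

$$\operatorname{curl}\,\mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \qquad (1.5)$$
$$\operatorname{curl}\,\mathbf{H} = \mathbf{J} + \frac{\partial \mathbf{D}}{\partial t}, \qquad (1.6)$$
$$\operatorname{div}\,\mathbf{D} = \rho, \qquad (1.7)$$
$$\operatorname{div}\,\mathbf{B} = 0. \qquad (1.8)$$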


In this modern notation, D and B are the electric and magnetic flux densities, E and 
H are the electric and magnetic field strengths and J is the current density. The term 
$\partial \mathbf{D}/\partial t$ is the famous displacement current, originally introduced on the basis of Maxwell’s
mechanical model for material media, or vacua (Maxwell, 1861a,b, 1862a,b). All mention 
of the mechanical origins of his model disappears from the final version of the theory in 
his great paper A dynamical theory of the electromagnetic field (Maxwell, 1865). 

Hertz’s apparatus for the generation and detection of electromagnetic radiation. The emitter a produced
electromagnetic radiation in discharges between the spherical conductors. The detector b consisted of a similar device
with the jaws of the detector placed as close together as possible to achieve maximum sensitivity. The emitter was
placed at the focus of a cylindrical paraboloid reflector to produce a directed beam of radiation (Hertz, 1893).

Maxwell’s was only one of a number of theories of the electromagnetic field and it took
some time before the full validation of his theory came about. The remarkable result of
Maxwell’s calculations was that the speed of light in a vacuum depended only upon the
fundamental constants of electrostatics $\epsilon_0$, the permittivity of free space, and magnetostatics
$\mu_0$, the permeability of free space, through the relation $c = (\epsilon_0 \mu_0)^{-1/2}$. The determination
of these quantities by laboratory experiments was a major concern of Maxwell and his
successor as Cavendish Professor, John William Strutt, Lord Rayleigh.
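
The relation between the speed of light and the two constants of free space can be checked with a few lines of arithmetic. The following is a minimal sketch; the numerical values of the permeability and permittivity are assumptions taken from modern SI tables, not from the text, and the function name is purely illustrative.

```python
import math

# Assumed values (not from the text): the classical defined value of the
# permeability of free space and a modern measured permittivity, in SI units.
MU_0 = 4.0e-7 * math.pi        # H/m
EPSILON_0 = 8.8541878128e-12   # F/m


def speed_of_light(epsilon_0: float, mu_0: float) -> float:
    """Return c = (epsilon_0 * mu_0) ** (-1/2)."""
    return 1.0 / math.sqrt(epsilon_0 * mu_0)


c = speed_of_light(EPSILON_0, MU_0)
print(f"c = {c:.6e} m/s")
```

The point of the sketch is that a purely electrostatic and a purely magnetostatic constant, each measurable on the laboratory bench, fix the propagation speed of the waves.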

The final validation of the equations was obtained through the brilliant experiments of 
Hertz in the period 1887-1889, almost a decade after Maxwell’s death (Hertz, 1893). Hertz 
demonstrated that electromagnetic disturbances are propagated at the speed of light in free 
space (Fig. 1.4). In addition, these waves behaved in all respects exactly like light, the 
subject headings of his great book being rectilinear propagation, polarisation, reflection, 
and refraction. This was the final proof of the validity of Maxwell’s equations. Ironically, in 
the same experiments which provided confirmation of Maxwell’s theory, Hertz discovered 
the photoelectric effect, the liberation of cathode rays by ultraviolet and optical radiation, 
which was to prove to be evidence for the fact that, in some circumstances, radiation behaves 
like particles. 

Maxwell resigned from his post at King’s College London in 1865 in order to look after 
the family estate at Glenlair in the Scottish borders and to continue his studies into his 
many areas of scientific interest. During the succeeding years, he put an enormous effort 
into the writing of his Treatise on Electricity and Magnetism (Maxwell, 1873). The Treatise 
is unlike many of the other great texts such as Newton’s Principia Mathematica in that it 
is not a systematic presentation of the subject but a work in progress, reflecting Maxwell’s 
extensive approach to research. In a later conversation, Maxwell remarked that the aim of 
the Treatise was not to expound his theory finally to the world, but to educate himself by 
presenting a view of the stage he had reached. Maxwell’s somewhat disconcerting advice 
was to read the four parts of the Treatise in parallel rather than in sequence. 


One of the most important results appears for the first time in Part 4, Sect. 792 of 
Volume 2 in which Maxwell worked out the pressure which radiation exerts on a conductor. 
This profound result provides the relation between the pressure $p$ and the energy density
$\varepsilon$ of a ‘gas’ of electromagnetic radiation, $p = \frac{1}{3}\varepsilon$, derived entirely from Maxwell’s electromagnetic
theory. This key result was used by Boltzmann in his paper of 1884 in which
he derived the Stefan—Boltzmann law from classical thermodynamics (Sect. 1.7.1). 

But more than experimental validation of the equations was needed. A new and different 
perspective had to be adopted to reveal the full power of Maxwell’s equations for the 
electromagnetic field. As Freeman Dyson has written, 


‘Maxwell’s theory had to wait for the next generation of physicists, Hertz and Lorentz and
Einstein, to reveal its power and clarify its concepts. The next generation grew up with 
Maxwell’s equations and was at home with a Universe built out of fields. The primacy 
of fields was as natural to Einstein as the primacy of mechanical structures had been for 
Maxwell.’ (Dyson, 1999) 


1.5 The Michelson—Morley experiment and the theory of relativity 


The unification of light and electromagnetism encouraged experimental physicists to find 
evidence for the medium through which electromagnetic phenomena are propagated, the 
aether. The most famous experiment was the pioneering interferometric measurements of 
Michelson (Fig. 1.5a). The final result published by Michelson and Edward Morley (1887) 
showed no evidence for the drift of the aether relative to the arms of the interferometer, the 
level of significance of this famous null result being quite enormous (Fig. 1.5b).

One of the concerns about Maxwell’s theory of the electromagnetic field was that the 
forms of the equations are not invariant with respect to the Galilean transformations of 
Newtonian physics. For this reason, Maxwell’s equations were regarded as being ‘non- 
relativistic’ — the equations would only hold good in a single preferred frame of reference 
and in all others there would be additional terms. Woldemar Voigt (1887) was aware
of the null result of Michelson’s experiments and in 1887 published an analysis of the 
transformations between inertial frames of reference with the assumption that the speed 
of propagation of the waves was the same in all inertial frames. To express his insight in 
different terms, he was the first physicist to work out the set of transformations which would 
leave the wave equation form invariant. He found the following transformations: 


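$$t' = t - \frac{Vx}{c^2}, \quad x' = x - Vt, \quad y' = \frac{y}{\gamma}, \quad z' = \frac{z}{\gamma}, \qquad \gamma = \left(1 - \frac{V^2}{c^2}\right)^{-1/2}, \tag{1.9}$$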


where V is the relative velocity of two inertial frames of reference. These transformations 
are almost identical to the standard Lorentz transformations of special relativity, but little
attention was paid to his analysis. Voigt showed that these transformations reduce to the
standard expression for the Doppler shift in the limit $V \ll c$.

(a) The Michelson–Morley experiment of 1887. (b) The null result of the Michelson–Morley experiment. The solid lines
show the average movement of the central fringe as the apparatus was rotated through 360°. The broken line shows
one eighth of the sinusoidal variation expected if the Earth moved through a stationary aether at 30 km s$^{-1}$
(Michelson, 1927).

Explanations of the null result of the Michelson—Morley experiment were proposed 
independently by Fitzgerald and Hendrik Lorentz. In his brief note of 1889, Fitzgerald 
suggested that a contraction of the length of the arm of the interferometer by a factor 
$\gamma = (1 - V^2/c^2)^{-1/2}$ in the direction of motion through the aether could account for the
null result (Fitzgerald, 1889). Lorentz had agonised about the null result of the Michelson— 
Morley experiment and came to the same conclusion that it could be explained by a physical 
contraction of dimensions in the direction of motion of the apparatus through the aether 
(Lorentz, 1892a). Over the succeeding decade, he built the Fitzgerald—Lorentz contraction 
into his theory of the electron. One of the suggestive results of the electrodynamics of 
moving charges was that, according to Maxwell’s theory of the electromagnetic field, the 
field lines would be squashed perpendicular to the direction of motion by the same factor
$\gamma$. Eventually in 1904 Lorentz arrived at the complete expressions for the Lorentz transformations
by a somewhat tortuous route (Lorentz, 1904), in stark contrast to the much
deeper and simpler approach of Einstein in his great paper of 1905, On the electrodynamics 
of moving bodies (Einstein, 1905c). The Lorentz transformations established by Lorentz 
and Einstein were

$$t' = \gamma\left(t - \frac{Vx}{c^2}\right), \quad x' = \gamma(x - Vt), \quad y' = y, \quad z' = z, \qquad \gamma = \left(1 - \frac{V^2}{c^2}\right)^{-1/2}. \tag{1.10}$$


These transformations were quickly adopted by the physics community and were to play a 
key role in unravelling the physics of the spectra of atoms and molecules over the coming 
decades. Notice that the reason for the similarity of Voigt’s and Lorentz’s transformations 
is the scale invariance of the wave equation 


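$$\nabla^2 \phi - \frac{1}{c^2}\frac{\partial^2 \phi}{\partial t^2} = 0 \tag{1.11}$$

under scaling relations $dx \to \kappa\,dx$, $dy \to \kappa\,dy$, $dz \to \kappa\,dz$ and $dt \to \kappa\,dt$.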
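
One way to see the connection between the two sets of transformations is to compare what each does to the quantity $c^2t^2 - x^2 - y^2$, which vanishes on a light wavefront. The sketch below is illustrative only: it assumes the forms $x' = x - Vt$ (Voigt) and $x' = \gamma(x - Vt)$ (Lorentz), and the event coordinates and function names are hypothetical choices. Under these assumptions the Lorentz transformation leaves the quantity unchanged, while Voigt's version rescales it by the uniform factor $1/\gamma^2$, which is harmless for a scale-invariant wave equation.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v):
    """Lorentz factor for relative velocity v."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def lorentz(t, x, y, v):
    """Lorentz transformation of an event (t, x, y)."""
    g = gamma(v)
    return g * (t - v * x / C**2), g * (x - v * t), y


def voigt(t, x, y, v):
    """Voigt's transformation: the Lorentz one divided throughout by gamma."""
    g = gamma(v)
    return t - v * x / C**2, x - v * t, y / g


def interval(t, x, y):
    """c^2 t^2 - x^2 - y^2, which is zero on a light wavefront."""
    return (C * t) ** 2 - x**2 - y**2


event = (2.0e-6, 150.0, 40.0)   # hypothetical event: t in s, x and y in m
v = 0.6 * C

s2 = interval(*event)
s2_lorentz = interval(*lorentz(*event, v))
s2_voigt = interval(*voigt(*event, v))

print(s2_lorentz / s2)                   # ratio for the Lorentz transformation
print(s2_voigt * gamma(v) ** 2 / s2)     # ratio for Voigt, after rescaling by gamma^2
```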


1.6 The origin of spectral lines 


The first decades of the nineteenth century marked the beginnings of quantitative exper- 
imental spectroscopy. The breakthrough resulted from the pioneering experiments and 
theoretical understanding of the laws of interference and diffraction of waves by Thomas 
Young. In his Bakerian Lecture of 1801 to the Royal Society of London, On the theory of 
light and colours, he used the wave theory of light of Christiaan Huygens to account for 
the results of interference experiments, such as his famous double-slit experiment (Young, 
1802). Among the most striking achievements of this paper was the measurement of the 
wavelengths of light of different colours using a diffraction grating with 500 grooves per 
inch. From this time onwards, wavelengths were used to characterise the colours in the 
spectrum. 

In 1802, William Wollaston made spectroscopic observations of sunlight and discovered 
five strong dark lines, as well as two fainter lines in the spectrum (Wollaston, 1802). The 
full significance of these observations only became apparent following the remarkable 
experiments of Joseph Fraunhofer. Fraunhofer’s motivation for studying the solar spectrum 
was his realisation that accurate measurements of the refractive indices of glasses should 
be made using monochromatic light. In his spectroscopic observations of the Sun, he 
rediscovered the narrow dark lines which would provide precisely defined wavelength 
standards. In his words, 


‘I wanted to find out whether in the colour-image (that is, spectrum) of sunlight, a similar 
bright stripe was to be seen, as in the colour-image of lamplight. But instead of this, I 
found with the telescope almost countless strong and weak vertical lines, which however 
are darker than the remaining part of the colour-image; some seem to be completely 
black.’ 


Fraunhofer's solar spectrum of 1814 showing the vast numbers of dark absorption lines. The colours of the various 
regions of the spectrum are labelled, as well as the letters A, a, B, C, D, E, b, F, G and H, indicating the most prominent 
absorption lines. The continuous line above the spectrum shows the approximate solar continuum intensity, as 
estimated by Fraunhofer (Fraunhofer, 1817a,b). 


He labelled the 10 strongest lines in the solar spectrum A, a, B, C, D, E, b, F, G and H 
and recorded 574 fainter lines between the B and H lines (Fig. 1.6) (Fraunhofer, 1817a,b), 
the notation still used today. From the technical point of view, a major advance was the 
invention of the spectroscope with which the deflection of light passing through the prism 
could be measured precisely. To achieve this, he placed a theodolite on its side and observed 
the spectrum through a telescope mounted on the rotating ring. 

The understanding of the dark lines in the solar spectrum had to await developments 
in laboratory spectroscopy. In his papers of 1817, Fraunhofer noted that the dark D lines 
coincided with the bright double line seen in lamplight. In 1849, Léon Foucault performed 
a key experiment in which sunlight was passed through a sodium arc so that the two spectra 
could be compared precisely. To his surprise, the solar spectrum displayed even darker 
D lines when passed through the arc than without the arc present (Foucault, 1849). He
followed up this observation with an experiment in which the continuum spectrum of light 
from glowing charcoal was passed through the arc and the dark D lines of sodium were 
found to be imprinted on the transmitted spectrum. 

Ten years later, the experiment was repeated by Gustav Kirchhoff who made the further 
crucial observation that, to observe an absorption feature, the source of the light had to 
be hotter than the absorbing flame. These results were immediately followed up in 1859 
by his understanding of the relation between the emissive and absorptive properties of any 
substance, what is now known as Kirchhoff’s law of emission and absorption of radiation 
(Kirchhoff, 1859). This states that, in thermal equilibrium, the radiant energy emitted
by a body at any frequency is precisely equal to the radiant energy absorbed at the same
frequency. Specifically, for isotropic radiation, the monochromatic emission coefficient $j_\nu$
is defined such that the increment in intensity $dI_\nu$ radiated into the solid angle $d\Omega$ from the
cylindrical volume $dV$ of area $dA$ and length $dl$ is

$$dI_\nu\, dA\, d\Omega = j_\nu\, dV\, d\Omega, \tag{1.12}$$

where $j_\nu$ has units W m$^{-3}$ Hz$^{-1}$ sr$^{-1}$. Since the emission is assumed to be isotropic, the
volume emissivity of the medium is $\varepsilon_\nu = 4\pi j_\nu$. We can write the volume of the cylinder
$dV = dA\, dl$ and so

$$dI_\nu = j_\nu\, dl. \tag{1.13}$$

The monochromatic absorption coefficient $\alpha_\nu$ is defined by the relation

$$dI_\nu\, dA\, d\Omega = -\alpha_\nu I_\nu\, dA\, d\Omega\, dl. \tag{1.14}$$

Kirchhoff showed that in thermodynamic equilibrium

$$\alpha_\nu B_\nu(T) = j_\nu. \tag{1.15}$$

In other words, the emission and absorption coefficients for any physical process are
related by the unknown spectrum of equilibrium radiation $B_\nu(T)$. This expression enabled
Kirchhoff to understand the relation between the emission and absorption properties of
flames, arcs, sparks and the solar atmosphere. In 1859, very little was known about the
form of $B_\nu(T)$. As Kirchhoff remarked,


‘It is a highly important task to find this function.’ 


This was one of the great experimental challenges for the remaining decades of the nine- 
teenth century. Kirchhoff’s profound insight was the beginning of a long and tortuous story 
which was to lead to Planck’s discovery of the formula for black-body radiation over 40 
years later. 
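
Kirchhoff's observation that the source must be hotter than the absorbing flame follows directly from the emission and absorption relations. As a rough sketch, assume (this combination is not written out in the text) that the two increments add, so that $dI_\nu/dl = j_\nu - \alpha_\nu I_\nu$ with constant coefficients, and that the flame's equilibrium intensity is $B_\nu = j_\nu/\alpha_\nu$. The numbers and names below are hypothetical and in arbitrary units.

```python
import math


def emergent_intensity(i_background, alpha, source, length):
    """Intensity after a path `length` through a uniform flame.

    Closed-form solution of dI/dl = j - alpha * I with constant j and alpha,
    written in terms of source = j / alpha (the equilibrium intensity).
    """
    return source + (i_background - source) * math.exp(-alpha * length)


B_FLAME = 1.0   # equilibrium intensity of the flame at the line frequency
ALPHA = 2.0     # absorption coefficient per unit length (hypothetical)
LENGTH = 1.5    # path length through the flame (hypothetical)

# Background brighter than the flame's equilibrium value: the line appears dark.
hot = emergent_intensity(5.0, ALPHA, B_FLAME, LENGTH)
# Background fainter than the flame's equilibrium value: the line appears bright.
cold = emergent_intensity(0.2, ALPHA, B_FLAME, LENGTH)

print(f"hot background:  5.0 -> {hot:.3f}")
print(f"cold background: 0.2 -> {cold:.3f}")
```

In both cases the emergent intensity is driven towards the flame's equilibrium value, so an absorption feature appears only when the background intensity exceeds it.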

Throughout the 1850s, there was considerable effort in Europe and in the USA aimed 
at identifying the emission and absorption lines produced by different substances in flame, 
spark and arc spectra. Different elements and compounds possessed distinctive patterns 
of spectral lines and attempts were made to relate these to the lines observed in the 
solar spectrum. In 1859, for example, Julius Plücker identified the Fraunhofer F line with
the bright H$\beta$ line of hydrogen and the C line was more or less coincident with H$\alpha$,
demonstrating the presence of hydrogen in the solar atmosphere. 

The most important work resulted from the studies of Robert Bunsen and Kirchhoff. In 
Kirchhoff’s great papers of 1861-1863 entitled Investigations of the solar spectrum and the
spectra of the chemical elements, the solar spectrum was compared with the spark spectra 
of 30 elements using a four-prism arrangement with which it was possible to view both the 
spectrum of the element and the solar spectrum simultaneously (Kirchhoff, 1861, 1862, 
1863). Kirchhoff concluded that the cool, outer regions of the solar atmosphere contained 
iron, calcium, magnesium, sodium, nickel and chromium and probably cobalt, barium, 
copper and zinc as well. 

While the patterns of spectral lines provided the fingerprints for the presence of different 
elements in stars, a major challenge was to discover formulae which could describe the 
wavelengths of the lines in the spectra of different elements. The first and most impor- 
tant success was the spectrum of hydrogen. In his remarkable papers of 1885, the Swiss 
schoolmaster Johann Jakob Balmer used laboratory and astronomical spectra to describe 
the wavelengths $\lambda$ of the lines in the spectrum of hydrogen by the expression:

$$\lambda = \frac{m^2}{m^2 - 4}\,\lambda_0, \tag{1.16}$$

where $m = 3, 4, 5, \ldots$ and $\lambda_0 = 3645$ Å. The constant $\lambda_0$ corresponds to the wavelength
limit of the Balmer series. In terms of frequencies $\nu$ or wavenumbers $n = 1/\lambda$, Balmer’s
formula can be written

$$\nu = \nu_0\left(1 - \frac{2^2}{m^2}\right) \quad \text{or} \quad n = \frac{1}{\lambda} = R_\infty\left(\frac{1}{2^2} - \frac{1}{m^2}\right), \tag{1.17}$$


where $R_\infty$ is known as the Rydberg constant. With the aid of subsequent astronomical observations,
Balmer was able to test his formula up to $m = 16$ and found precise agreement
with (1.16) (Balmer, 1885). This was the first quantum mechanical formula to be discovered.
Balmer died 15 years before the deep significance of his numerological discovery
was appreciated by Niels Bohr. A similar regularity appeared in Pickering’s observation
of 1896 of the star $\zeta$ Puppis. He discovered a sequence of absorption lines resembling the
Balmer series, which became known as the Pickering series (Pickering, 1896). He showed 
that the lines could be described by Balmer’s formula provided half-integral values of the 
principal quantum number m were used. 
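
Balmer's formula is simple enough to evaluate directly. The sketch below uses the series limit of 3645 Å quoted above and assumes the relation $R_\infty = 4/\lambda_0$, which follows from comparing the wavelength and wavenumber forms of the formula; the function names are illustrative only.

```python
LAMBDA_0 = 3645.0  # series limit in angstroms, as quoted in the text


def balmer_wavelength(m: int) -> float:
    """Wavelength in angstroms of the Balmer line with integer m >= 3."""
    if m < 3:
        raise ValueError("Balmer's formula applies for m = 3, 4, 5, ...")
    return LAMBDA_0 * m**2 / (m**2 - 4)


def rydberg_from_limit(lambda_0_angstrom: float) -> float:
    """Rydberg constant in m^-1 implied by the series limit, R = 4 / lambda_0."""
    return 4.0 / (lambda_0_angstrom * 1.0e-10)


for m in range(3, 8):
    print(m, round(balmer_wavelength(m), 1))

print(f"R = {rydberg_from_limit(LAMBDA_0):.4e} m^-1")
```

The first two values correspond to the H$\alpha$ and H$\beta$ lines, and the implied constant is close to the modern value of the Rydberg constant.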

Of the numerous formulae proposed to describe the lines in the spectra of different 
elements, two are of special significance. In 1889, Johannes Rydberg presented a memoir 
to the Royal Swedish Academy in which he proposed that the spectral lines of all series in 
atomic spectra could be described by the formula 


$$n = n_0 - \frac{R_\infty}{(m + \mu)^2}, \tag{1.18}$$

where $m$ is a positive integer and $R_\infty$, the same Rydberg constant introduced in (1.17), was
to be a constant for all series. The empirical constant $\mu$ is known as the quantum defect. The
spectroscopists had found various regularities in the spectra of the elements, the strongest
lines forming the principal series, while other regularities were found for broader lines
which were known as the diffuse series and a sharp series of lines which were so named.
For each series of each element, $n_0$ and $\mu$ were estimated from the experimentally measured
wavelengths, the quantum defect $\mu$ lying in the range $0 < \mu < 1$. If $\mu = 0$, the formula
reduces to Balmer’s formula. The formula was applied successfully to the principal, diffuse 
and sharp series of the lines in the spectra of sodium, potassium, magnesium, calcium and 
zinc. This formula was later generalised to the following forms for the different series: 


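$$\text{Principal series} \quad n_p = R_\infty\left[(1 + s)^{-2} - (m + p)^{-2}\right], \tag{1.19}$$
$$\text{Diffuse series} \quad n_d = R_\infty\left[(2 + p)^{-2} - (m + d)^{-2}\right], \tag{1.20}$$
$$\text{Sharp series} \quad n_s = R_\infty\left[(2 + p)^{-2} - (m + s)^{-2}\right], \tag{1.21}$$

where $m = 3, 4, 5, \ldots$ and the constants $s$, $p$ and $d$ are chosen to fit the observed spectra
for each series.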
In 1908, these formulae were generalised further by Walther Ritz (1908) who formulated 
his combination principle according to which every spectral line could be expressed as the 
difference of two so-called spectral terms, each of which depended upon an integer m as
well as the constants appearing in (1.19)-(1.21). Thus, Ritz wrote:


$$\nu = (2, p, \pi) - (m, s, \sigma), \tag{1.22}$$


where the terms in brackets are generalisations of the terms in (1.18). The inference is that 
the spectral lines of any element include frequencies that are either the sum or the difference 
of the frequencies of two other lines. Thus, if the individual terms in the Ritz formula are A, 
B and C and the observed lines are (A − B) and (B − C), by adding the frequencies we find
(A − B) + (B − C) = (A − C) and by subtracting the lines (A − B) and (A − C) we find (A − B) − (A − C) = (C − B).
This proved to be a powerful tool for disentangling complex spectra of elements into their 
different constituent series. But, even more importantly, Ritz’s combination principle was 
to prove to lie at the very heart of the revolution which was to lead to the reformulation of 
the laws of physics at the atomic level (Sect. 11.3). 
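
The combination principle can be illustrated with the simplest possible terms. The sketch below assumes hydrogen-like terms $R_\infty/m^2$ (the case of zero quantum defect) and a modern value of the Rydberg constant, neither of which is Ritz's own more general term formula. It shows that the difference of the wavenumbers of two Balmer lines is itself the wavenumber of another hydrogen line, the first line of the infrared series now named after Paschen.

```python
R_INF = 1.0973731568e7  # Rydberg constant in m^-1 (assumed modern value)


def term(m: int) -> float:
    """Hydrogen-like spectral term R / m^2 (assumes zero quantum defect)."""
    return R_INF / m**2


def wavenumber(lower: int, upper: int) -> float:
    """Wavenumber of a line as the difference of two spectral terms."""
    return term(lower) - term(upper)


h_alpha = wavenumber(2, 3)          # first Balmer line
h_beta = wavenumber(2, 4)           # second Balmer line
predicted = h_beta - h_alpha        # difference of two observed lines
paschen_alpha = wavenumber(3, 4)    # the line between terms 3 and 4

print(f"H-beta minus H-alpha: {predicted:.1f} m^-1")
print(f"Line between terms 3 and 4: {paschen_alpha:.1f} m^-1")
```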


1.7 The spectrum of black-body radiation 


Kirchhoff’s exhortation about the central importance of determining the equilibrium dis- 
tribution of radiation B(T) stimulated theoretical and experimental physicists to take up 
the challenge. The function B(T) is known as the black-body spectrum because it is the 
equilibrium spectrum of a perfect radiator and absorber in thermal equilibrium. The next 
task is to understand the preliminaries which were to lead to Planck’s epochal discovery 
in 1900. The first steps involved determining the total amount of radiation emitted by a 
black-body, the Stefan-Boltzmann law. 


1.7.1 The derivation of the Stefan—Boltzmann law 


Josef Stefan deduced the law which bears his name empirically from experimental data 
published by Tyndall in 1865 on the radiation of a platinum strip heated electrically to 
different temperatures. In 1879, he published a paper in which he stated: 


‘From weak red heat (about 525 C) to complete white heat (about 1200 C) the intensity of 
radiation increases from 10.4 to 122, thus nearly twelvefold (more precisely 11.7). This 
observation caused me to take the heat radiation as proportional to the fourth power of 
the absolute temperature. The ratio of the absolute temperature 273 + 1200 and 273 + 
525 raised to the fourth power gives 11.6.’ (Stefan, 1879) 


This relation refers to the total radiation emitted by the heated body, 


$$-\frac{dE}{dt} = \text{total radiant energy per second} \propto T^4. \tag{1.23}$$
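
Stefan's arithmetic is easily reproduced. The following sketch uses only the numbers in the quotation above, with the same conversion to absolute temperature by adding 273; the variable names are illustrative.

```python
def kelvin(celsius: float) -> float:
    """Absolute temperature, using Stefan's conversion of adding 273."""
    return 273.0 + celsius


# Intensities quoted by Stefan at weak red heat and complete white heat.
observed_ratio = 122.0 / 10.4

# Ratio expected if the radiation scales as the fourth power of temperature.
predicted_ratio = (kelvin(1200.0) / kelvin(525.0)) ** 4

print(f"observed:  {observed_ratio:.2f}")
print(f"predicted: {predicted_ratio:.2f}")
```

The two ratios agree to about one per cent, which is the basis of Stefan's inference of the fourth-power law.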


In 1884, Boltzmann, Stefan’s pupil, deduced this law theoretically from considerations of 
classical thermodynamics (Boltzmann, 1884). Suppose a closed volume is filled only with 
electromagnetic radiation and that it contains a piston so that the ‘gas’ of radiation can be 
compressed or expanded. If the heat dQ is added to the system, the total internal energy is
increased by dU and work is done on the piston as it is pushed outwards slightly so that the 
volume increases by dV. By conservation of energy, 


$$dQ = dU + p\,dV. \tag{1.24}$$

Now rearrange this relation by introducing the associated increase in entropy $dS = dQ/T$:

$$T\,dS = dU + p\,dV. \tag{1.25}$$


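This relation is converted into a partial differential equation by dividing through by $dV$ at
constant $T$:

$$T\left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial U}{\partial V}\right)_T + p. \tag{1.26}$$

We now use the Maxwell relation

$$\left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial p}{\partial T}\right)_V$$

to recast this relation as

$$T\left(\frac{\partial p}{\partial T}\right)_V = \left(\frac{\partial U}{\partial V}\right)_T + p. \tag{1.27}$$

Thus, if we know the relation between $p$ and $U$, that is, the equation of state for the gas,
the relation between temperature and energy density can be found. This relation for a ‘gas’
of electromagnetic radiation was derived by Maxwell in his Treatise on Electricity and
Magnetism, $p = \frac{1}{3}\varepsilon$, where $\varepsilon$ is the energy density of radiation (see Sect. 1.4).

Therefore, $U = \varepsilon V$ and $p = \frac{1}{3}\varepsilon$ and so, from equation (1.27), we find

$$T\left(\frac{\partial (\varepsilon/3)}{\partial T}\right)_V = \left(\frac{\partial (\varepsilon V)}{\partial V}\right)_T + \frac{\varepsilon}{3}; \qquad \frac{T}{3}\frac{d\varepsilon}{dT} = \varepsilon + \frac{\varepsilon}{3} = \frac{4}{3}\varepsilon.$$

Therefore,

$$\frac{d\varepsilon}{\varepsilon} = 4\,\frac{dT}{T}; \qquad \ln \varepsilon = 4 \ln T + \text{constant}; \qquad \varepsilon \propto T^4. \tag{1.28}$$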


This is Boltzmann’s contribution to the Stefan-Boltzmann law. In modern form, the law 
is written $I = \sigma T^4$, where $I$ is the radiant energy emitted per unit area per second from
the surface of a black-body at temperature T. The experimental evidence for the Stefan— 
Boltzmann law was not particularly convincing in 1884 and it was not until 1897 that 
Lummer and Pringsheim undertook their very careful experiments which showed that the 
law was indeed correct with high precision. 
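
The last step of Boltzmann's argument can also be checked numerically. The sketch below integrates $d\varepsilon/dT = 4\varepsilon/T$ with a standard fourth-order Runge-Kutta scheme, which is simply a convenient choice here and has nothing to do with Boltzmann's own method; the temperatures and the function name are hypothetical. Doubling the temperature should increase the energy density by a factor of $2^4 = 16$.

```python
def integrate_energy_density(eps_start, t_start, t_end, steps=1000):
    """Integrate d(eps)/dT = 4 * eps / T from t_start to t_end (RK4)."""

    def slope(t, eps):
        return 4.0 * eps / t

    h = (t_end - t_start) / steps
    eps, t = eps_start, t_start
    for _ in range(steps):
        k1 = slope(t, eps)
        k2 = slope(t + h / 2, eps + h * k1 / 2)
        k3 = slope(t + h / 2, eps + h * k2 / 2)
        k4 = slope(t + h, eps + h * k3)
        eps += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return eps


# Energy density in arbitrary units, normalised to 1 at 300 K.
ratio = integrate_energy_density(1.0, 300.0, 600.0)
print(f"energy density ratio on doubling T: {ratio:.6f}")
```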


1.7.2 Wien’s displacement law and the spectrum of black-body radiation 


The spectrum of black-body radiation was poorly known until the turn of the century 
but there had already been some important work done on the theoretical form which the 
radiation law should have. In 1894, Wilhelm (Willy) Wien published his derivation of his 
displacement law by a combination of electromagnetism and thermodynamics and this
was to prove to be of central importance in the development of the theory of black-body 
radiation (Wien, 1894). 

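First, consider the adiabatic expansion of a ‘gas’ of electromagnetic radiation. The flow
of heat in or out of the system is $dQ = dU + p\,dV$ and, in an adiabatic expansion, $dQ = 0$.
Further, since $U = \varepsilon V$ and $p = \frac{1}{3}\varepsilon$,

$$d(\varepsilon V) + \tfrac{1}{3}\varepsilon\,dV = 0; \qquad V\,d\varepsilon + \varepsilon\,dV + \tfrac{1}{3}\varepsilon\,dV = 0; \qquad \frac{d\varepsilon}{\varepsilon} = -\frac{4}{3}\frac{dV}{V}. \tag{1.29}$$

Integrating,

$$\varepsilon = \text{constant} \times V^{-4/3}. \tag{1.30}$$

But $\varepsilon = aT^4$ where $a = 4\sigma/c$ and hence

$$TV^{1/3} = \text{constant}. \tag{1.31}$$

Since $V$ is proportional to the cube of the dimension of a spherical volume, $V \propto r^3$ and so
$T \propto r^{-1}$.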

The next step is to work out the relation between the wavelength of the radiation and
the volume of the enclosure. This is no more than the Doppler shift formula for radiation
enclosed within the expanding volume $V$, $\lambda \propto r$, that is, the wavelength of the radiation
increases in proportion to the size of the spherical volume. We now combine
this result with the relation $T \propto r^{-1}$ to find

$$T \propto \lambda^{-1}. \tag{1.32}$$

This is one aspect of Wien’s displacement law. If radiation is adiabatically expanded, the
wavelength of radiation changes inversely with temperature if we follow a particular set
of waves. In other words, the wavelength of the radiation is ‘displaced’ as the temperature
changes. In particular, if we follow the maximum of the radiation spectrum it should follow
the $\lambda \propto T^{-1}$ law. This was found to be in agreement with experiment.
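
The scaling relations for the adiabatic expansion can be collected in a short sketch. The starting temperature, wavelength and expansion factor below are hypothetical, and the function simply applies $T \propto r^{-1}$ and $\lambda \propto r$; it then checks that the product $\lambda T$ is unchanged and that the energy density, taken to scale as $T^4$, follows the $V^{-4/3}$ law.

```python
def expand(temperature, wavelength, radius_factor):
    """Adiabatic expansion of the enclosure radius by `radius_factor`.

    Applies T proportional to 1/r and wavelength proportional to r.
    """
    return temperature / radius_factor, wavelength * radius_factor


T1, LAM1 = 6000.0, 5.0e-7     # hypothetical starting values (K, m)
FACTOR = 2.0                  # the radius doubles, so the volume grows 8-fold

T2, LAM2 = expand(T1, LAM1, FACTOR)

energy_density_ratio = (T2 / T1) ** 4          # from eps proportional to T^4
volume_law_ratio = (FACTOR**3) ** (-4.0 / 3.0)  # from eps proportional to V^(-4/3)

print(f"lambda * T before: {LAM1 * T1:.3e}, after: {LAM2 * T2:.3e}")
print(f"energy density ratio: {energy_density_ratio:.4f} vs {volume_law_ratio:.4f}")
```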

Wien now went further and combined the Stefan-Boltzmann law with the law $T \propto \lambda^{-1}$
to set constraints on what the spectral form of the radiation had to be. My own version of 
the argument is as follows. The first step is to note that if any system of bodies is enclosed 
in a perfectly reflecting enclosure, eventually they will all come to the same temperature 
because of the emission and absorption of radiation. At the microscopic level, the system 
comes into thermal equilibrium, whatever the properties of the walls, as a result of the 
principle of detailed balance. The radiation comes into equilibrium with the objects in 
the enclosure so that as much energy is radiated as absorbed by the bodies per unit time 
and the equilibrium spectrum is a black-body spectrum. The radiation will be isotropic and 
so the only parameters which can characterise the radiation are the temperature T of the 
enclosure and the wavelength of the radiation $\lambda$.

If the black-body radiation is initially at temperature $T_1$ and the enclosure is expanded
adiabatically, then since an adiabatic expansion takes place infinitely slowly, the radiation
takes up an equilibrium spectrum at all stages in the expansion until it reaches temperature
$T_2$. The crucial point is that, in an adiabatic expansion, the radiation spectrum has
black-body form at all stages in the expansion from $T_1$ to $T_2$. The unknown law for the
radiation spectrum must therefore scale appropriately with temperature.

Consider the radiation in the wavelength interval $\lambda_1$ to $\lambda_1 + d\lambda_1$ and let its energy density
be $\varepsilon = u(\lambda_1)\,d\lambda_1$. Then, in the expansion, according to Boltzmann’s analysis, the energy
associated with the radiation of any particular set of waves decreases as $T^4$ and hence

$$\frac{u(\lambda_1)\,d\lambda_1}{u(\lambda_2)\,d\lambda_2} = \left(\frac{T_1}{T_2}\right)^4. \tag{1.33}$$

But $\lambda_1 T_1 = \lambda_2 T_2$ and hence $d\lambda_1 = (T_2/T_1)\,d\lambda_2$. Therefore,


$$\frac{u(\lambda_1)}{T_1^5} = \frac{u(\lambda_2)}{T_2^5}, \tag{1.34}$$

that is,

$$\frac{u(\lambda)}{T^5} = \text{constant}, \tag{1.35}$$

and, since $\lambda T$ = constant, this can be rewritten as

$$u(\lambda)\lambda^5 = \text{constant}. \tag{1.36}$$

Notice that $u(\lambda)$ is the energy density per unit wavelength interval in the radiation spectrum.
Now the only combination of $T$ and $\lambda$ which is a constant is the product $\lambda T$ and hence
we can write that, in general, the constant on the right-hand side of (1.36) can only be
constructed out of functions involving $\lambda T$. Therefore the radiation law must have the form

$$u(\lambda)\lambda^5 = f(\lambda T), \tag{1.37}$$

or

$$u(\lambda)\,d\lambda = \lambda^{-5} f(\lambda T)\,d\lambda. \tag{1.38}$$


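This is Wien’s displacement law in its entirety and it sets constraints on the form of the
radiation law for black-body radiation. In terms of frequency, the relation can be written

$$u(\lambda)\,d\lambda = u(\nu)\,d\nu; \qquad \lambda = c/\nu, \qquad d\lambda = -\frac{c}{\nu^2}\,d\nu,$$

and hence

$$u(\nu)\,d\nu = \left(\frac{c}{\nu}\right)^{-5} f\!\left(\frac{\nu}{T}\right)\left(-\frac{c}{\nu^2}\,d\nu\right), \tag{1.39}$$

that is,

$$u(\nu)\,d\nu = \nu^3 f\!\left(\frac{\nu}{T}\right)d\nu, \tag{1.40}$$

where the constant factors and the sign, which only reflects the opposite senses in which $\lambda$ and $\nu$ increase, have been absorbed into the unknown function $f$.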


This brilliant argument shows how far Wien was able to get using only rather general 
thermodynamic arguments. This work was new in 1894 when Planck first became interested 
in the problem of the spectrum of black-body radiation. 
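
The content of the displacement law is that a single function of $\lambda T$ fixes the spectrum at every temperature. The sketch below makes this concrete with an arbitrary trial function; the exponential form and the constant 0.0144 are hypothetical stand-ins chosen only for illustration, since the argument holds for any $f$. It checks that $u(\lambda)/T^5$ takes the same value at two temperatures when the wavelengths are related by $\lambda_1 T_1 = \lambda_2 T_2$.

```python
import math


def u_lambda(wavelength, temperature, f):
    """Spectrum of the Wien form u = wavelength^-5 * f(wavelength * temperature)."""
    return wavelength ** -5 * f(wavelength * temperature)


def trial_f(x):
    """Hypothetical trial function of the product lambda * T (illustration only)."""
    return math.exp(-0.0144 / x)


T1, T2 = 1000.0, 2000.0        # hypothetical temperatures in K
LAM1 = 3.0e-6                  # wavelength followed at temperature T1, in m
LAM2 = LAM1 * T1 / T2          # displaced wavelength at temperature T2

lhs = u_lambda(LAM1, T1, trial_f) / T1**5
rhs = u_lambda(LAM2, T2, trial_f) / T2**5

print(f"u(lambda_1)/T1^5 = {lhs:.6e}")
print(f"u(lambda_2)/T2^5 = {rhs:.6e}")
```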
1.8 The gathering storm


The triumphs of nineteenth century physics were hard-won and their full implications were 
still being understood. To the thorny problems described in the sections of this chapter were 
about to be added a whole new set of discoveries which were indeed to rock the foundations 
of the subject. Most physicists clung to the hope that the challenges described in this chapter 
would eventually be resolved within the context of classical physics, but even that hope was 
to prove illusory. Physics was about to enter a period when the foundations were cut away 
and it would be 30 years before the theory of quantum mechanics was discovered. Through 
this period of uncertainty, some of the greatest minds in physics pitted themselves against 
almost insuperable problems with courage and imagination. 
2.1 The key role of experimental technique


1895 was not an arbitrary choice as the cut-off year for the history recounted in Chap. 1. Over 
the following years a number of experimental discoveries were made which were to change 
dramatically the face of physics. The origins of these discoveries can be traced to the need for 
increased precision in experimental physics. The industrial revolution and the widespread 
availability of electricity and electrical communication required more exact understanding 
of the physical properties of materials and also the establishment of international standards. 
These necessitated a much more professional approach to the teaching of experimental 
physics and its associated theory. As part of that movement the Clarendon Laboratory 
was founded in Oxford in 1868 and the Cavendish Laboratory in Cambridge in 1874. Of 
particular significance for this chapter was the foundation of the Physikalisch-Technische 
Reichsanstalt in Berlin in 1887 with the task of providing precise measurements of the
physical properties of materials which would be of importance to industry. There was an 
expectation that these laboratories would develop new techniques for undertaking precision 
measurements. 

New experimental techniques were developed thanks to the development of new tech- 
nologies. To mention only a few of the more important of these, better vacuums were 
available to physicists through the invention of the Geissler pump, invented by Johann 
Heinrich Wilhelm Geissler, a brilliant inventor and glass-blower, in about 1855. The vac- 
uum was produced by trapping air in a mercury column and forcing it down the column 
by the force of gravity. Typically pressures of 0.1 mm of mercury could be obtained 
within his vacuum tubes. Higher voltages could be placed across discharge tubes with the 
invention of the Rühmkorff coil which consisted of a transformer with a small number 
of windings on the primary and a very large number on the secondary. High precision 
spectroscopy was revolutionised by the superb diffraction gratings produced by Henry 
Rowland, who pioneered the construction of high precision ruling engines. Samuel Pierpont
Langley perfected the techniques of infrared spectrometry through the invention of
the platinum bolometer with which temperature changes as small as $10^{-4}$ K could be
measured. Although Langley’s interests were primarily astronomical, the techniques were 
used to great advantage in the determination of the spectrum of black-body radiation 
in the final decade of the nineteenth century. The combination of these and many other 
advances in technology and experimental technique were about to produce unexpected 
fruits. 
2.2 1895-1900: The changing landscape of experimental physics


The vacuum tube was to play a central role in the discoveries of the 1890s. The basic 
apparatus consisted of a thin-walled glass vacuum tube with positive and negative electrodes
between which a high voltage could be maintained. It was realised in the early nineteenth 
century that gases are poor conductors of electricity since they are electrically neutral. 
At very low pressures and high voltages, however, electrical discharges were observed in 
vacuum tubes. Geissler tubes became popular scientific toys and revealed a remarkable range 
of coloured discharges. William Crookes began his systematic studies of the conduction 
of electricity through gases in 1879. The appearance of the discharge in the vacuum tube 
changed as the pressure of the gas decreased. The colourful displays disappeared as the 
pressure decreased but a current continued to flow. At low enough pressures, the walls 
of the tube glowed green due to phosphorescence. It was found that objects placed in 
the tube cast a shadow on the walls of the tube opposite the cathode. It was inferred 
that a stream of cathode rays must be dragged off the cathode causing the walls of the 
tube to glow. By 1895, it was established that the cathode rays were negatively charged 
particles which could be deflected by magnetic fields. In 1895 Jean-Baptiste Perrin collected 
the cathode rays inside a vacuum tube and found directly that they had negative electric 
charge. 


2.2.1 The discovery of X-rays 


In 1895, Wilhelm Conrad Röntgen discovered, by accident, that wrapped unexposed pho- 
tographic plates left close to discharge tubes were darkened. In addition, if a discharge tube 
were completely surrounded by thin black cardboard, fluorescent materials left close to it 
glowed in the dark. Röntgen came to the correct conclusion that both phenomena were 
associated with some new form of radiation emitted by the discharge tube, and he named 
these X-rays (Röntgen, 1895). His startling X-ray image of his wife’s hand, showing the 
bones of the hand and her massive ring, had an immediate impact (Fig. 2.1a). Overnight, 
X-rays became a matter of the greatest public interest and were very rapidly incorporated 
into the armoury of the doctor’s surgery. In 1896, there were already more than 1000 ar- 
ticles on X-rays. The X-rays were more penetrating than cathode rays, since they could 
blacken photographic plates at a considerable distance from the hot spot on the discharge 
tube, which was known to be their source. Their identification with ‘ultra-ultraviolet’ ra- 
diation was only convincingly demonstrated in 1906, when Charles Barkla found that the 
X-radiation was polarised (Barkla, 1906) and in 1912, when Max von Laue had the inspi- 
ration of looking for their diffraction by crystals (Friedrich et al., 1912; Laue, 1912), in the 
process opening up the new field of X-ray crystallography. Measurements of the interaction
of X-rays with matter were to prove to be a central tool in disentangling the structure of
atoms.

Fig. 2.1 (a) The first X-ray picture of Röntgen’s wife’s hand, illustrating the ability of X-rays to penetrate through soft tissues
and reveal the bone structure. (b) Becquerel’s developed plate showing a strong image, despite the fact that the
radioactive salt had not been exposed to sunlight.


2.2.2 The discovery of radioactivity 


The association of X-rays with fluorescent materials led to the search for other sources of 
X-radiation. In 1896, Henri Becquerel, who came from a distinguished family of French 
physicists, tested several known fluorescent substances before investigating samples of 
potassium uranyl disulphate. The photographic plates were wrapped in several sheets of 
black paper, the phosphorescent material was exposed to sunlight and then the plate devel- 
oped to find if it had been darkened by X-rays. Becquerel’s remarkable discovery was that 
the plates were darkened even when the phosphorescent material was not exposed to light 
(Fig. 2.1b). This was the discovery of natural radioactivity (Becquerel, 1896). In further 
experiments carried out in the same year, Becquerel showed that the amount of radioactivity 
was proportional to the amount of uranium in the substance and that the radioactive flux of 
radiation was constant in time. Another important discovery was that the radiation from the 
uranium compounds discharged electroscopes. Pierre Curie and Marie Skłodowska-Curie
repeated Becquerel’s experiments in 1897 and showed that the intensity of radioactivity was
proportional to the amount of uranium in different samples. Other radioactive substances
were soon identified. The radioactivity of thorium was discovered in 1898 (Schmidt, 1898).
By concentrating uranium residues, the Curies discovered the new element polonium, named
after Marie’s native land. The radioactivity of the new element decreased exponentially with
time: radioactive substances have a characteristic half-life. In September 1898, very strong
radioactivity was found in the barium group of residues. Sufficient of the radioactive
substance was isolated to show that a new element had been discovered, radium (Curie and
Skłodowska-Curie, 1898; Curie et al., 1898).
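
The exponential decrease of activity with a fixed half-life can be illustrated with a short numerical sketch. The function names are this example's own, and the half-life of about 138 days (the modern value for polonium-210) is an assumption brought in for illustration; it is not a figure quoted in the text.

```python
import math

def remaining_fraction(t, half_life):
    """Fraction of the initial activity left after time t (same units as half_life)."""
    return 0.5 ** (t / half_life)

def decay_constant(half_life):
    """Decay constant lam in A(t) = A0 * exp(-lam * t), equivalent to the half-life form."""
    return math.log(2.0) / half_life

# Assumed, illustrative half-life in days (modern value for polonium-210).
HALF_LIFE_DAYS = 138.0

for days in (0.0, 138.0, 276.0, 414.0):
    # Each additional half-life halves the activity once more.
    print(days, remaining_fraction(days, HALF_LIFE_DAYS))
```

The two forms, $2^{-t/T_{1/2}}$ and $\exp(-\lambda t)$ with $\lambda = \ln 2/T_{1/2}$, describe the same decay law.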

As soon as the discovery of radioactivity was announced, Ernest Rutherford took up
the study of radioactivity. In his first publication on the subject, Rutherford established
that there are at least two separate types of radiation emitted by radioactive substances
(Rutherford, 1899). He called the component which is most easily absorbed α-radiation (or
α-rays) and the much more penetrating component β-radiation (or β-rays). It took another
10 years before Rutherford conclusively demonstrated that the α-radiation consisted of
the nuclei of helium atoms (Rutherford and Royds, 1909). In contrast, the β-rays were
convincingly shown by Walter Kaufmann to have the same mass-to-charge ratio as the
recently discovered electron (Kaufmann, 1902). γ-radiation was discovered in 1900 by Paul
Villard as an extremely penetrating form of radiation emitted in radioactive decays; the
γ-rays were undeflected by a magnetic field (Villard, 1900a,b). The γ-rays were conclusively
identified as electromagnetic waves 14 years later when Rutherford and Edward Andrade
observed the reflection of γ-rays from crystal surfaces (Rutherford and Andrade, 1913).

The α-, β- and γ-rays were the only known radiations which could cause the ionisation
of air. The characteristic property which distinguished them was their penetrating power.
In quantitative terms, these were:


• The α-particles ejected in radioactive decays produce a dense stream of ions and are
stopped in air within about 0.05 m. This is called the range of the particles.

• The β-particles have greater ranges, but there is not a well-defined value for any particular
radioactive decay.

• The γ-rays were found to have by far the longest ranges, a few centimetres of lead being
necessary to reduce their intensity by a factor of 10.


2.2.3 The discovery of the electron 


With Röntgen’s announcement of the discovery of X-rays, John Joseph (JJ) Thomson im- 
mediately directed his research efforts to the study of electrical discharges in gases. Exper- 
imenting with cathode rays was difficult in Thomson’s time because a very good vacuum 
was required. Otherwise, the cathode rays collided with air molecules, ionising the gas 
and making it a conductor which shielded the rays. Thomson recognised the necessity of 
extremely low pressures. He and his assistant Ebenezer Everett, who built all his apparatus
and carried out the experiments, took great pains to secure excellent vacua and to eliminate 
the effects of outgassing from the materials of the tube (Fig. 2.2). 

Fig. 2.2 The vacuum tube with which J. J. Thomson measured the charge-to-mass ratio of cathode rays.

The discovery of the electron in 1897 is traditionally attributed to Thomson on the basis
of his famous series of experiments, in which he established that the charge-to-mass ratio,
$e/m_{\rm e}$, of cathode rays is about two-thousand times that of hydrogen ions. At the same time,
several other physicists were hot on the trail.


• In 1896, Pieter Zeeman discovered the broadening of spectral lines when a sodium flame
is placed between the poles of a strong electromagnet. Lorentz interpreted this result in
terms of the splitting of the spectral lines due to the motion of the ‘ions’ in the atoms
about the magnetic field direction; he found a lower limit of 1000 for the value of $e/m_{\rm e}$
of the ‘ions’.

• In January 1897, Emil Wiechert used the magnetic deflection technique to obtain a
measurement of $e/m_{\rm e}$ for cathode rays and concluded that these particles had mass
between 2000 and 4000 times smaller than that of hydrogen, assuming their electric
charge was the same as that of hydrogen ions. He obtained only an upper limit to the
speed of the particles since it was assumed that the kinetic energy of the cathode rays
was $E_{\rm kin} = eV$, where $V$ is the accelerating voltage of the discharge tube.

• Walter Kaufmann’s experiment was similar to Thomson’s. He found the same values of
$e/m_{\rm e}$, no matter which gas filled the discharge tube, a result which puzzled him. He found
a value of $e/m_{\rm e}$ 1000 times greater than that of hydrogen ions. He concluded


‘... that the hypothesis of cathode rays as emitted particles is by itself inadequate for a
satisfactory explanation of the regularities I have observed.’


Thomson was the first of these pioneers to interpret the experiments in terms of a 
sub-atomic particle. In his words, cathode rays constituted 


‘... a new state, in which the subdivision of matter is carried very much further than in
the ordinary gaseous state.’ (Thomson, 1897) 


Thomson further showed that the particles ejected in the photoelectric effect, discovered 
by Hertz in 1887, were also identical with electrons. In 1898, Thomson modified one of 
C. T. R. Wilson’s early cloud chambers to measure the charge of the electron. He counted 
the total number of droplets formed and their total charge. From these, he estimated 
$e = 2.2 \times 10^{-19}$ C, compared with the present standard value of $1.602 \times 10^{-19}$ C. This
experiment was the precursor of the famous Millikan oil drop experiment, in which the 
water-vapour droplets were replaced by fine drops of a heavy oil, which did not evaporate 
during the course of the experiment. Thus, Thomson pursued a more sustained and detailed 
campaign than the other physicists in establishing the universality of what became known 
as electrons, the name coined for cathode rays by Johnstone Stoney in 1891 (Stoney, 1891). 

2.3 Planck and the spectrum of black-body radiation


The discoveries in experimental physics summarised in the last section form the background 
to Planck’s research into the nature of black-body radiation — these were the cutting- 
edge areas of research which had uncovered a whole new range of phenomena which 
required theoretical explanation. Up to 1894, Planck’s researches had been principally
into elucidating the nature of classical thermodynamics and the law of increase of entropy.
Following the death of Kirchhoff in 1887, he succeeded to Kirchhoff’s chair at the University
of Berlin in 1889. This was a propitious advance since Berlin was one of the most active
centres of physics research in the world.

In 1894, Planck turned his attention to the problem of the spectrum of black-body 
radiation. It is likely that his interest in this problem was stimulated by Wien’s important 
paper of that year, which was discussed in Sect. 1.7.2 (Wien, 1894). Wien’s analysis has a 
strong thermodynamic flavour, which must have appealed to Planck. 

In 1895, Planck published the first results of his work on the resonant scattering of 
plane electromagnetic waves by an oscillating dipole. This was the first of Planck’s papers 
in which he diverged from his previous areas of interest in that it appeared to be about 
electromagnetism rather than entropy. In the last words of the paper, however, Planck made 
it clear that he regarded this as a first step towards tackling the problem of the spectrum of 
black-body radiation. His aim was to set up a system of oscillators in an enclosed cavity 
which would radiate and interact with the radiation produced so that after a long time the 
system would come into equilibrium. He could then apply the laws of thermodynamics to 
black-body radiation with a view to understanding the origin of its spectrum. He explained 
why this approach offered the prospect of providing insights into basic thermodynamic 
processes. When energy is lost by an oscillator by radiation, it is not lost as heat, but 
as electromagnetic waves. The process can be considered ‘conservative’ because, if the 
radiation is enclosed in a box with perfectly reflecting walls, it can then react back on 
the oscillator. Furthermore, the process is independent of the nature of the oscillator. In
Planck’s words, 


“The study of conservative damping seems to me to be of great importance, since it opens 
up the prospect of a possible general explanation of irreversible processes by means 
of conservative forces — a problem that confronts research in theoretical physics more 
urgently every day.” 


Planck believed that, by studying classically the interaction of oscillators with electromag- 
netic radiation, he would be able to show that entropy increases absolutely for a system 
consisting of matter and radiation. These ideas, which were elaborated in a series of five 
papers, did not work, as was pointed out by Boltzmann. One cannot obtain a monotonic 
approach to equilibrium without some statistical assumptions about the way in which the 
system approaches the equilibrium state. This can be understood from Maxwell’s simple but 
compelling argument concerning the time reversibility of the laws of mechanics, dynamics 
and electromagnetism. 

Finally, Planck conceded that statistical assumptions were necessary and introduced the
concept of ‘natural radiation’, which corresponded to Boltzmann’s assumption of ‘molec- 
ular chaos’. Once the assumption was made that there exists a state of ‘natural radiation’, 
Planck was able to make considerable progress with his programme. The first thing he 
did was to relate the energy density of radiation in an enclosure to the average energy 
of the oscillators within it. This is a very important result and Planck derived it entirely 
from classical arguments concerning the emission and absorption of radiation by a set of 
oscillators. 

Why did Planck treat oscillators rather than, say, atoms, molecules, lumps of rock, and so
on? The reason is that, in thermal equilibrium, everything is in equilibrium with everything
else: rocks are in equilibrium with atoms and oscillators, and therefore there is no advantage
in treating complicated objects. The advantage of considering simple harmonic oscillators
is that the radiation and absorption laws can be calculated exactly. To express this important
point in another way, Kirchhoff’s law (1.15), which relates the emission and absorption
coefficients, $j_\nu$ and $\alpha_\nu$ respectively,

$$\alpha_\nu B_\nu(T) = j_\nu,$$

shows that, if we can determine these coefficients for any process, we can find the universal
equilibrium spectrum $B_\nu(T)$.

I have given a detailed derivation of Planck’s relation between the mean energy of the 
oscillator and the spectrum of radiation in thermal equilibrium with it in Chap. 12 of TCP2. 
Here I summarise some of the key results and aspects of the analysis which will prove 
important in later parts of the story. 


2.3.1 The rate of radiation of an oscillator 


Charged particles emit electromagnetic radiation when they are accelerated. The total rate
of loss of energy by an accelerated electron is

$$-\left(\frac{dE}{dt}\right)_{\rm rad} = \frac{|\ddot{\boldsymbol{p}}|^2}{6\pi\epsilon_0 c^3} = \frac{e^2|\ddot{\boldsymbol{r}}|^2}{6\pi\epsilon_0 c^3}, \qquad (2.1)$$

where $\boldsymbol{p} = e\boldsymbol{r}$ is the dipole moment with respect to some origin and $e$ the electric charge
of the electron. Note that the radiation depends only upon the instantaneous acceleration
$\ddot{\boldsymbol{r}}$ of the electron. This result is often referred to as Larmor’s formula. The three essential
properties of the radiation of an accelerated charged particle are:


(a) The total radiation rate is given by Larmor’s formula (2.1) in which the acceleration is
the proper acceleration of the charged particle and the radiation loss rate is measured
in the instantaneous rest frame of the particle.

(b) The polar diagram of the radiation is of dipolar form, that is, the electric field strength
varies as $\sin\theta$ and the power radiated per unit solid angle varies as $\sin^2\theta$, where $\theta$ is
the angle with respect to the acceleration vector of the particle. There is no radiation
along the direction of the acceleration vector and the field strength is greatest at right
angles to it.

(c) The radiation is polarised with the electric field vector lying in the direction of the
acceleration vector of the particle, as projected onto a sphere at distance $r$.
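
Properties (a) and (b) are linked by an integral over solid angle: a $\sin^2\theta$ pattern integrated over the whole sphere gives $8\pi/3$, which is the factor connecting the power per unit solid angle to the total rate (2.1). The following is a minimal numerical sketch of that integral; the function name and the number of integration steps are choices made for this example only.

```python
import math

def dipole_pattern_integral(n_steps=20000):
    """Midpoint-rule integral of sin^2(theta) over the full sphere."""
    d_theta = math.pi / n_steps
    total = 0.0
    for i in range(n_steps):
        theta = (i + 0.5) * d_theta
        # Solid-angle element for an axially symmetric pattern:
        # dOmega = 2 * pi * sin(theta) * d_theta
        total += math.sin(theta) ** 2 * 2.0 * math.pi * math.sin(theta) * d_theta
    return total

numeric = dipole_pattern_integral()
exact = 8.0 * math.pi / 3.0
print(numeric, exact)
```

With this factor, a power per unit solid angle of $e^2|\ddot{\boldsymbol{r}}|^2\sin^2\theta/16\pi^2\epsilon_0c^3$ integrates to the total loss rate $e^2|\ddot{\boldsymbol{r}}|^2/6\pi\epsilon_0c^3$ of (2.1).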


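We now apply the result (2.1) to the case of an oscillator performing simple harmonic
oscillations with amplitude $x_0$ at angular frequency $\omega_0$, $x = |x_0|\exp(i\omega_0 t)$. Therefore,
$\ddot{x} = -\omega_0^2|x_0|\exp(i\omega_0 t)$, or, taking the real part, $\ddot{x} = -\omega_0^2|x_0|\cos\omega_0 t$. The instantaneous
rate of loss of energy by radiation is therefore

$$-\left(\frac{dE}{dt}\right)_{\rm rad} = \frac{\omega_0^4 e^2|x_0|^2}{6\pi\epsilon_0 c^3}\cos^2\omega_0 t. \qquad (2.2)$$

The average value of $\cos^2\omega_0 t$ is $\tfrac{1}{2}$ and hence the average rate of loss of energy in the form
of electromagnetic radiation is

$$-\left(\frac{dE}{dt}\right)_{\rm average} = \frac{\omega_0^4 e^2|x_0|^2}{12\pi\epsilon_0 c^3}. \qquad (2.3)$$

As shown below, the energy $E$ of the oscillator is $E = \tfrac{1}{2}m|x_0|^2\omega_0^2$ and so

$$-\frac{dE}{dt} = \gamma E, \qquad (2.4)$$

where $\gamma = \omega_0^2 e^2/6\pi\epsilon_0 c^3 m$. This expression can be rewritten in terms of the classical
electron radius $r_{\rm e} = e^2/4\pi\epsilon_0 m_{\rm e}c^2$, where $m_{\rm e}$ is the mass of the electron. Thus, for an
oscillating electron, $\gamma = 2r_{\rm e}\omega_0^2/3c$.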
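
To get a feel for the size of the damping constant, the two expressions for $\gamma$ can be evaluated numerically. This is only a sketch: the constants are rounded SI values, the names are this example's own, and the choice of an oscillator radiating at a wavelength of 500 nm is an assumption made purely for illustration.

```python
import math

# Rounded SI values of the physical constants (names are this sketch's own).
E_CHARGE = 1.602176634e-19    # elementary charge, C
M_ELECTRON = 9.1093837e-31    # electron mass, kg
EPSILON_0 = 8.8541878e-12     # permittivity of free space, F/m
C_LIGHT = 2.99792458e8        # speed of light, m/s

def classical_electron_radius():
    """r_e = e^2 / (4 pi eps0 m_e c^2)."""
    return E_CHARGE**2 / (4.0 * math.pi * EPSILON_0 * M_ELECTRON * C_LIGHT**2)

def gamma_direct(omega0):
    """gamma = omega0^2 e^2 / (6 pi eps0 c^3 m), taking m to be the electron mass."""
    return omega0**2 * E_CHARGE**2 / (6.0 * math.pi * EPSILON_0 * C_LIGHT**3 * M_ELECTRON)

def gamma_from_radius(omega0):
    """gamma = 2 r_e omega0^2 / (3 c), the same quantity written with r_e."""
    return 2.0 * classical_electron_radius() * omega0**2 / (3.0 * C_LIGHT)

# Assumed example: an oscillator radiating at a wavelength of 500 nm.
omega0 = 2.0 * math.pi * C_LIGHT / 500e-9
print(classical_electron_radius(), gamma_direct(omega0), gamma_direct(omega0) / omega0)
```

For optical frequencies the ratio $\gamma/\omega_0$ comes out many orders of magnitude below unity, which is the sense in which the radiation damping of the oscillator is very weak.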


2.3.2 Radiation damping of an oscillator 


The equation of damped simple harmonic motion can be written in the form

$$m\ddot{x} + a\dot{x} + kx = 0,$$

where $m$ is the (reduced) mass, $k$ the spring constant and $a\dot{x}$ the damping force. The energies
associated with each of these terms are found by multiplying through by $\dot{x}$ and integrating
with respect to time:

$$\tfrac{1}{2}m\int_0^t d(\dot{x}^2) + \int_0^t a\dot{x}^2\,dt + \tfrac{1}{2}k\int_0^t d(x^2) = 0. \qquad (2.5)$$

We identify the terms in (2.5) with the kinetic energy, the damping energy loss and the
potential energy of the oscillator respectively. For simple harmonic motion of the form
$x = |x_0|\cos\omega_0 t$, the average kinetic and potential energies are:

$$\text{average kinetic energy} = \tfrac{1}{4}m|x_0|^2\omega_0^2; \qquad \text{average potential energy} = \tfrac{1}{4}k|x_0|^2.$$

If the damping is very small, the natural frequency of oscillation of the oscillator is given by
$\omega_0^2 = k/m$. Thus, the average kinetic and potential energies of the oscillator are equal and the
total energy is the sum of the two,

$$E = \tfrac{1}{2}m|x_0|^2\omega_0^2. \qquad (2.6)$$

Inspection of the second term on the left-hand side of (2.5) shows that the average rate of
loss of energy by radiation from the oscillator is

$$-\left(\frac{dE}{dt}\right)_{\rm rad} = \tfrac{1}{2}a|x_0|^2\omega_0^2 = \frac{a}{m}E. \qquad (2.7)$$


This relation can now be compared with the expression for the loss rate by radiation of the 
oscillator (2.4). We obtain the correct expression for the decay in amplitude of the oscillator 
by identifying y with a/m in (2.7). This phenomenon is known as the radiation damping 
of an oscillator. 
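
The averages used in this argument are easy to confirm numerically. The sketch below averages the kinetic energy, the potential energy and the damping loss $a\dot{x}^2$ over one period of undamped motion; the parameter values are hypothetical and in arbitrary units, and the helper `time_average` is this example's own.

```python
import math

def time_average(f, period, n_steps=10000):
    """Midpoint-rule average of f(t) over one period."""
    dt = period / n_steps
    return sum(f((i + 0.5) * dt) for i in range(n_steps)) * dt / period

# Hypothetical oscillator parameters in arbitrary units.
m, k, a, x0 = 2.0, 8.0, 0.01, 0.5
omega0 = math.sqrt(k / m)            # natural frequency for very small damping
period = 2.0 * math.pi / omega0

def position(t):
    return x0 * math.cos(omega0 * t)

def velocity(t):
    return -omega0 * x0 * math.sin(omega0 * t)

mean_kinetic = time_average(lambda t: 0.5 * m * velocity(t) ** 2, period)
mean_potential = time_average(lambda t: 0.5 * k * position(t) ** 2, period)
mean_loss = time_average(lambda t: a * velocity(t) ** 2, period)
total_energy = 0.5 * m * x0**2 * omega0**2   # equation (2.6)

print(mean_kinetic, mean_potential, mean_loss, (a / m) * total_energy)
```

The mean kinetic and potential energies agree with each other, and the mean loss rate agrees with $(a/m)E$, as in (2.7).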

We can appreciate now why Planck believed this was a fruitful way of attacking the 
problem. The radiation loss does not go into heat, but into electromagnetic radiation and 
the constant y depends only upon fundamental constants, if we take the oscillator to be an 
oscillating electron. In contrast, in frictional damping, the energy goes into heat and the 
loss rate formula contains constants appropriate to the material. Furthermore, in the case 
of electromagnetic waves, if the oscillators and waves are confined within an enclosure 
with perfectly reflecting walls, energy is not lost from the system and the waves can react 
back on the oscillators returning energy to them. This is why Planck called the damping
conservative damping. 


2.3.3 The equilibrium radiation spectrum of a harmonic oscillator 


The expressions for the dynamics of an oscillator undergoing radiation, or natural, damping 
which have just been derived are: 


$$m\ddot{x} + a\dot{x} + kx = 0; \qquad \ddot{x} + \gamma\dot{x} + \omega_0^2 x = 0. \qquad (2.8)$$

If an electromagnetic wave is incident on the oscillator, energy can be transferred to it and
a forcing term is added to (2.8),

$$\ddot{x} + \gamma\dot{x} + \omega_0^2 x = \frac{F}{m}. \qquad (2.9)$$

If the oscillator is accelerated by the $E$ field of an incident wave, $F = eE_0\exp(i\omega t)$. To find
the response of the oscillator, a trial solution for $x$ of the form $x = x_0\exp(i\omega t)$ is adopted.
Then

$$x_0 = \frac{eE_0}{m(\omega_0^2 - \omega^2 + i\gamma\omega)}. \qquad (2.10)$$

Notice that there is a complex factor in the denominator and this means that the oscillator 
does not vibrate in phase with the incident wave. This does not matter for our calculation. 
We are only interested in the square of the modulus of the amplitude. 
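
The complex amplitude (2.10) and its squared modulus can be explored with Python's built-in complex numbers. This is a sketch under stated assumptions: `force_per_mass` stands for the combination $eE_0/m$, and the values of $\omega_0$ and $\gamma$ are hypothetical, in arbitrary units.

```python
import cmath
import math

def response_amplitude(omega, omega0, gamma, force_per_mass=1.0):
    """Complex amplitude x0 of (2.10); force_per_mass stands for e * E0 / m."""
    return force_per_mass / complex(omega0**2 - omega**2, gamma * omega)

def modulus_squared(omega, omega0, gamma, force_per_mass=1.0):
    """|x0|^2 written out explicitly, as needed for the radiated power."""
    return force_per_mass**2 / ((omega0**2 - omega**2) ** 2 + (gamma * omega) ** 2)

omega0, gamma = 1.0, 0.05   # hypothetical values, arbitrary units
for omega in (0.5, 1.0, 1.5):
    x0 = response_amplitude(omega, omega0, gamma)
    # The phase of x0 is the phase of the oscillator relative to the driving wave.
    print(omega, abs(x0) ** 2, cmath.phase(x0))
```

At resonance the phase of $x_0$ is $-\pi/2$, so the oscillator lags the driving field by a quarter of a cycle, while the squared modulus is what enters the radiation rate.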

Now, let us work out the rate of radiation of the oscillator under the influence of the 
incident radiation field. If we set this equal to the ‘natural’ radiation of the oscillator, we 
will have found the equilibrium spectrum — the work done by the incident radiation field is 
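just enough to supply the energy loss per second by the oscillator. From (2.2), the rate of
radiation of the oscillator is

$$-\left(\frac{dE}{dt}\right)_{\rm rad} = \frac{\omega^4 e^2|x_0|^2}{6\pi\epsilon_0 c^3}\cos^2\omega t,$$

where we now use the value of $x_0$ derived in (2.10). We need the square of the modulus
of $x_0$ and this is found by multiplying $x_0$ by its complex conjugate, that is,

$$|x_0|^2 = \frac{e^2E_0^2}{m^2\left[(\omega_0^2-\omega^2)^2 + \gamma^2\omega^2\right]}.$$

Therefore, the radiation rate is

$$-\left(\frac{dE}{dt}\right)_{\rm rad} = \frac{\omega^4 e^4E_0^2}{12\pi\epsilon_0 c^3m^2\left[(\omega_0^2-\omega^2)^2 + \gamma^2\omega^2\right]}, \qquad (2.11)$$

where we have taken averages over time, that is, $\langle\cos^2\omega t\rangle = \tfrac{1}{2}$.
We next replace the value of $E_0^2$ in our formula by the sum over all the waves of angular
frequency $\omega$ incident upon the oscillator to find the total average reradiated power,

$$-\left(\frac{dE}{dt}\right)_{\rm rad} = \frac{\omega^4 e^4}{6\pi\epsilon_0 c^3m^2\left[(\omega_0^2-\omega^2)^2 + \gamma^2\omega^2\right]}\;\frac{1}{2}\sum_i E_{0i}^2. \qquad (2.12)$$

The next step is to note that the loss rate (2.12) is part of a continuum intensity distribution
and so this sum can be written as an incident intensity in the frequency band $\omega$ to $\omega + d\omega$,
that is,

$$I(\omega)\,d\omega = \tfrac{1}{2}\epsilon_0 c\sum_i E_{0i}^2. \qquad (2.13)$$

Therefore, the total average radiation loss rate is

$$-\left(\frac{dE}{dt}\right)_{\rm rad} = \frac{\omega^4 e^4}{6\pi\epsilon_0^2c^4m^2}\,\frac{I(\omega)\,d\omega}{\left[(\omega_0^2-\omega^2)^2 + \gamma^2\omega^2\right]} = \frac{8\pi r_{\rm e}^2}{3}\,\frac{\omega^4\,I(\omega)\,d\omega}{\left[(\omega_0^2-\omega^2)^2 + \gamma^2\omega^2\right]}, \qquad (2.14)$$

where we have introduced the classical electron radius, $r_{\rm e} = e^2/4\pi\epsilon_0 m_{\rm e}c^2$.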


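Now the response curve of the oscillator is very sharply peaked about the value $\omega_0$,
because the radiation rate is very small in comparison with the total energy of the oscillator,
that is, $\gamma \ll \omega_0$ (see Fig. 2.3). We can therefore make some simplifying approximations.
If $\omega$ appears on its own, we set $\omega \to \omega_0$, and $(\omega_0^2 - \omega^2) = (\omega_0 + \omega)(\omega_0 - \omega) \approx 2\omega_0(\omega_0 - \omega)$.
Therefore,

$$-\left(\frac{dE}{dt}\right)_{\rm rad} \approx \frac{2\pi r_{\rm e}^2}{3}\,\frac{\omega_0^2\,I(\omega)\,d\omega}{\left[(\omega-\omega_0)^2 + (\gamma^2/4)\right]}. \qquad (2.15)$$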
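
The quality of this near-resonance approximation can be checked numerically by comparing the frequency-dependent factor of (2.14) with the Lorentzian form used in (2.15). The sketch below assumes a hypothetical oscillator with $\omega_0 = 1$ and $\gamma = 10^{-3}$ in arbitrary units; the common prefactor $8\pi r_{\rm e}^2/3$ has been divided out of both expressions.

```python
def exact_response(omega, omega0, gamma):
    """Frequency factor of (2.14): omega^4 / [(omega0^2 - omega^2)^2 + gamma^2 omega^2]."""
    return omega**4 / ((omega0**2 - omega**2) ** 2 + gamma**2 * omega**2)

def lorentzian_response(omega, omega0, gamma):
    """Near-resonance form of (2.15): (omega0^2 / 4) / [(omega - omega0)^2 + gamma^2 / 4]."""
    return 0.25 * omega0**2 / ((omega - omega0) ** 2 + 0.25 * gamma**2)

omega0, gamma = 1.0, 1.0e-3   # hypothetical values with gamma << omega0
for n in range(-3, 4):
    omega = omega0 + n * gamma            # step across the line in units of gamma
    exact = exact_response(omega, omega0, gamma)
    approx = lorentzian_response(omega, omega0, gamma)
    print(n, exact, approx, abs(exact - approx) / exact)
```

The two expressions coincide at resonance and differ only by terms of order $(\omega - \omega_0)/\omega_0$ across the width of the line.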


Finally, we expect $I(\omega)$ to be a slowly varying function in comparison with the sharpness
of the response curve of the oscillator and so we can set $I(\omega)$ equal to a constant, $I(\omega_0)$, over the
range of values of $\omega$ of interest,

$$-\left(\frac{dE}{dt}\right)_{\rm rad} \approx \frac{2\pi r_{\rm e}^2}{3}\,\omega_0^2\,I(\omega_0)\int_0^\infty \frac{d\omega}{\left[(\omega-\omega_0)^2 + (\gamma^2/4)\right]}. \qquad (2.16)$$

Fig. 2.3 Illustrating the response curve of the oscillator to waves of different frequencies. (Vertical axis: intensity.)

The integral is easy if we set the lower limit equal to minus infinity, which is perfectly
permissible since the function tends very rapidly to zero at frequencies away from that of
the peak intensity. Using

$$\int_{-\infty}^{\infty}\frac{dx}{x^2 + a^2} = \frac{\pi}{a},$$

the integral in (2.16) becomes $2\pi/\gamma$ and so

$$-\left(\frac{dE}{dt}\right)_{\rm rad} = \frac{2\pi\omega_0^2 r_{\rm e}^2}{3}\,\frac{2\pi}{\gamma}\,I(\omega_0) = \frac{4\pi^2\omega_0^2 r_{\rm e}^2}{3\gamma}\,I(\omega_0). \qquad (2.17)$$
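
How little is lost by extending the lower limit to minus infinity can be estimated from the arctangent antiderivative of the Lorentzian. The following sketch compares the integral taken from zero with the value $2\pi/\gamma$; the function names and the values of $\omega_0$ and $\gamma$ are assumptions made for this example.

```python
import math

def integral_from_zero(omega0, gamma):
    """Integral of 1 / [(omega - omega0)^2 + gamma^2 / 4] from 0 to infinity,
    using the antiderivative (2 / gamma) * atan(2 * (omega - omega0) / gamma)."""
    return (2.0 / gamma) * (math.pi / 2.0 + math.atan(2.0 * omega0 / gamma))

def integral_full_line(gamma):
    """Same integrand with the lower limit at minus infinity: pi / a with a = gamma / 2."""
    return 2.0 * math.pi / gamma

omega0, gamma = 1.0, 1.0e-3   # hypothetical values, arbitrary units
relative_error = 1.0 - integral_from_zero(omega0, gamma) / integral_full_line(gamma)
print(relative_error, gamma / (2.0 * math.pi * omega0))
```

The fractional error is close to $\gamma/2\pi\omega_0$, which is negligible whenever $\gamma \ll \omega_0$.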

The rate of radiation should now be set equal to the spontaneous radiation rate of the 
oscillator. There is only one complication. We have assumed that the oscillator can re- 
spond to incident radiation arriving from any direction. For a single oscillator with axis in, 
say, the x-direction, there are directions of incidence in which the oscillator does not re- 
spond to the incident wave — this occurs if a component of the incident electric field is 
perpendicular to the dipole axis of the oscillator. We get round this problem by the argu- 
ment used by Richard Feynman. Suppose three oscillators are mutually at right angles to 
each other. Then this system can respond like a completely free oscillator and follow any 
incident electric field. Therefore, we obtain the correct result if we suppose that (2.17) is 
the radiation which would be emitted by three mutually perpendicular oscillators, each
oscillating at frequency $\omega_0$. Equating (2.17) to three times (2.4), we find the rather remarkable
result: 


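$$\frac{4\pi^2\omega_0^2 r_{\rm e}^2}{3\gamma}\,I(\omega_0) = 3\gamma E \quad\text{where}\quad \gamma = \frac{2r_{\rm e}\omega_0^2}{3c},$$

and hence

$$I(\omega_0) = \frac{\omega_0^2}{\pi^2c^2}\,E. \qquad (2.18)$$

Writing (2.18) in terms of the spectral energy density $u(\omega_0)$,

$$u(\omega_0) = \frac{I(\omega_0)}{c} = \frac{\omega_0^2}{\pi^2c^3}\,E. \qquad (2.19)$$

We can now drop the subscript zero on $\omega_0$ since this result applies to all frequencies in
equilibrium. In terms of frequencies, with $\omega = 2\pi\nu$:

$$u(\omega)\,d\omega = u(\nu)\,d\nu = \frac{\omega^2}{\pi^2c^3}\,E\,d\omega,$$

that is,

$$u(\nu) = \frac{8\pi\nu^2}{c^3}\,E. \qquad (2.20)$$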


This is the remarkable result published by Planck in June 1899 (Planck, 1899). All 
information about the nature of the oscillator has completely disappeared from the problem. 
All that remains is the average energy of the oscillator. The meaning behind the relation is 
obviously very deep and fundamental in a thermodynamic sense. The whole analysis has 
proceeded through a study of the electrodynamics of oscillators, and yet the final result 
contains no trace of the means by which we arrived at the answer. One can appreciate how 
excited Planck must have been when he discovered this basic result — if we can work out 
the average energy of an oscillator of frequency $\nu$ in an enclosure at temperature $T$, we can
find immediately the spectrum of black-body radiation. 
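
The disappearance of the oscillator's properties from the final result can be seen numerically. The sketch below solves the balance condition for $I(\omega_0)$ with two very different values of $r_{\rm e}$ and then converts to $u(\nu)$; the function names, the frequency and the mean energy are assumptions chosen only for illustration.

```python
import math

C_LIGHT = 2.99792458e8   # speed of light, m/s

def intensity_from_balance(omega0, mean_energy, r_e):
    """Solve (4 pi^2 omega0^2 r_e^2 / 3 gamma) I = 3 gamma E for I,
    with gamma = 2 r_e omega0^2 / (3 c)."""
    gamma = 2.0 * r_e * omega0**2 / (3.0 * C_LIGHT)
    return 9.0 * gamma**2 * mean_energy / (4.0 * math.pi**2 * omega0**2 * r_e**2)

def u_nu(nu, mean_energy):
    """Spectral energy density per unit frequency, equation (2.20)."""
    return 8.0 * math.pi * nu**2 * mean_energy / C_LIGHT**3

nu = 5.0e14             # assumed frequency, Hz
mean_energy = 4.0e-21   # assumed mean oscillator energy, J
omega0 = 2.0 * math.pi * nu

# Two very different "oscillator sizes" give the same equilibrium intensity.
i_small = intensity_from_balance(omega0, mean_energy, 2.8e-15)
i_large = intensity_from_balance(omega0, mean_energy, 1.0e-10)
print(i_small, i_large, u_nu(nu, mean_energy))
```

The factor of $2\pi$ between $u(\omega)$ and $u(\nu)$ follows from $u(\nu)\,d\nu = u(\omega)\,d\omega$ with $d\omega = 2\pi\,d\nu$.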


2.3.4 Rayleigh and the spectrum of black-body radiation 


In fact, the same relation (2.20) between the average energy of an oscillator and the 
intensity of radiation in thermal equilibrium was also worked out by a completely different 
route by John William Strutt, Baron Rayleigh in 1900. Rayleigh was the author of the
famous book The Theory of Sound (1894) and so approached the problem from the point
of view of the equilibrium distribution of waves in a box (Rayleigh, 1900). Suppose the
box is a cube with sides of length $L$. Inside the box, all possible wave modes consistent
with the boundary conditions are allowed to come into thermodynamic equilibrium at
temperature $T$. The wave equation for the waves in the box is

$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2} = \frac{1}{c_{\rm s}^2}\frac{\partial^2\psi}{\partial t^2}, \qquad (2.21)$$

where $c_{\rm s}$ is the speed of the waves. The walls are fixed and so the waves must have zero
amplitude, $\psi = 0$, at $x, y, z = 0$ and $x, y, z = L$. The solution of this problem is well-known:

$$\psi = C\,{\rm e}^{-i\omega t}\sin\frac{l\pi x}{L}\,\sin\frac{m\pi y}{L}\,\sin\frac{n\pi z}{L}, \qquad (2.22)$$

corresponding to standing waves which fit perfectly into the box in three dimensions,
provided $l$, $m$ and $n$ are integers. Each combination of $l$, $m$ and $n$ is called a mode of
oscillation of the waves in the box. These modes are complete, independent and orthogonal
and so represent all the ways in which the medium can oscillate within the box.

We now substitute (2.22) into the wave equation (2.21) and find the relation between the
values of $l$, $m$, $n$ and the angular frequency of the waves, $\omega$. Setting $c_{\rm s} = c$ for
electromagnetic waves,

$$\frac{\omega^2}{c^2} = \frac{\pi^2}{L^2}(l^2 + m^2 + n^2) = \frac{\pi^2p^2}{L^2}, \qquad (2.23)$$

where $p^2 = l^2 + m^2 + n^2$. We thus find a relation between the modes as parameterised by
$p$ and their angular frequency $\omega$. According to the Maxwell-Boltzmann
equipartition theorem, energy is shared equally among each independent mode and so we
need to know the number of modes in the range $p$ to $p + dp$. We find this by drawing a
three-dimensional lattice in $(l, m, n)$ space and evaluating the number of modes in the octant
of a sphere in that space. If $p$ is large, the number of modes in an octant of a spherical
shell of radius $p$ and width $dp$ is

$$n(p)\,dp = \tfrac{1}{8}\,4\pi p^2\,dp = \frac{L^3\omega^2}{2\pi^2c^3}\,d\omega. \qquad (2.24)$$

The waves are electromagnetic waves and so there are two independent linear polarisations
for any given mode. Therefore, there are twice as many modes as given by (2.24).
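
The continuum approximation behind (2.24) can be tested by counting lattice points directly. The sketch below counts the triples of positive integers $(l, m, n)$ with $l^2 + m^2 + n^2 \le p^2$ and compares the count with the volume of an octant of a sphere of radius $p$; the function names and the values of $p$ are choices made for this example.

```python
import math

def count_modes(p_max):
    """Count triples of positive integers (l, m, n) with l^2 + m^2 + n^2 <= p_max^2."""
    limit = p_max * p_max
    count = 0
    for l in range(1, p_max + 1):
        for m in range(1, p_max + 1):
            lm = l * l + m * m
            if lm >= limit:
                break  # no n >= 1 fits for this m or any larger m
            # Largest integer n with n^2 <= limit - lm; every n from 1 up to it is a mode.
            count += math.isqrt(limit - lm)
    return count

def octant_volume(p_max):
    """Volume of one octant of a sphere of radius p_max in (l, m, n) space."""
    return (1.0 / 8.0) * (4.0 / 3.0) * math.pi * p_max**3

for p in (50, 100, 200):
    print(p, count_modes(p), octant_volume(p), count_modes(p) / octant_volume(p))
```

The ratio of the count to the octant volume approaches unity as $p$ increases, which is why the shell formula is used only when $p$ is large. Differentiating the octant volume with respect to $p$ gives the shell count $\tfrac{1}{8}\,4\pi p^2\,dp$.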
According to the Maxwell-Boltzmann doctrine of the equipartition of energy, each mode
of oscillation has average energy $E$. Then, counting both polarisations, the energy density
of electromagnetic radiation in the box is:

$$u(\nu)\,d\nu\,L^3 = E \times 2n(p)\,dp = \frac{L^3\omega^2E}{\pi^2c^3}\,d\omega, \qquad (2.25)$$

that is,

$$u(\nu) = \frac{8\pi\nu^2}{c^3}\,E, \qquad (2.26)$$

exactly the same result as that derived by Planck from electrodynamics (2.20). 

A number of features of Rayleigh’s analysis are worth noting. First of all, Rayleigh deals 
directly with the electromagnetic waves themselves, rather than with the oscillators which 
are the source of the waves and which are in equilibrium with them. Second, central to the 
result is the doctrine of the equipartition of energy. What the Maxwell-Boltzmann doctrine 
states is that, if the system is left long enough, irregularities in the energy distribution 
among the oscillations are smoothed out by various energy interchange mechanisms. In 
many natural phenomena, the equilibrium distributions are set up very quickly. 


2.4 Towards the spectrum of black-body radiation


Planck’s next step is at first slightly surprising. Classically, in thermodynamic equilibrium,
each degree of freedom is allocated $\tfrac{1}{2}kT$ of energy and hence the mean energy of a harmonic
oscillator should be $E = kT$, because it has two degrees of freedom, those associated with
the squared terms $x^2$ and $\dot{x}^2$ in the expression for its energy. Setting $E = kT$, we find

$$u(\nu) = \frac{8\pi\nu^2}{c^3}\,kT, \qquad (2.27)$$

exactly the result derived by Rayleigh. This turned out to be the correct expression for the 
black-body radiation law at low frequencies, the Rayleigh-Jeans law. Why did Planck not 
do this? First of all, the equipartition theorem of Maxwell and Boltzmann is a result of 
statistical thermodynamics and this was the point of view which Planck had rejected. As we 
discussed in Sect. 1.3, it was far from clear in 1899 how secure the equipartition theorem 
really was. Planck had already had his fingers burned by Boltzmann when he failed to note 
the necessity of statistical assumptions in order to derive the equilibrium state of black-body 
radiation (Sect. 2.3). He therefore adopted a rather different approach. To quote his words: 


‘I had no alternative than to tackle the problem once again — this time from the opposite
side — namely from the side of thermodynamics, my own home territory where I felt myself 
to be on safer ground. In fact, my previous studies of the Second Law of Thermodynamics 
came to stand me in good stead now, for at the very outset I hit upon the idea of correlating 
not the temperature of the oscillator but its entropy with its energy ... While a host of 
outstanding physicists worked on the problem of the spectral energy distribution both 
from the experimental and theoretical aspect, every one of them directed his efforts solely 
towards exhibiting the dependence of the intensity of radiation on the temperature. On the 
other hand, I suspected that the fundamental connection lies in the dependence of entropy 
upon energy. ... Nobody paid any attention to the method which I adopted and I could 
work out my calculations completely at my leisure, with absolute thoroughness, without 
fear of interference or competition.’ (Planck, 1950) 


In March 1900, Planck worked out the following relation for the change in entropy of a 
system which is not in equilibrium (Planck, 1900a): 

$$\Delta S = \frac{3}{5}\,\frac{\partial^2 S}{\partial E^2}\,\Delta E\,\mathrm{d}E. \tag{2.28}$$


This equation applies to an oscillator the entropy of which deviates from the maximum
value in that an individual resonator deviates by an amount $\Delta E$ from the equilibrium energy
E. The entropy change occurs when the energy of the oscillator is changed by dE. Thus,
if $\Delta E$ and dE have opposite signs, so that the system tends to return towards equilibrium,
the entropy change is positive and the function $\partial^2 S/\partial E^2$ must necessarily have a negative
value. A negative value of $\partial^2 S/\partial E^2$ means that there is an entropy maximum and thus if
$\Delta E$ and dE are of opposite signs, the system must approach equilibrium.

To appreciate what Planck did next, we need to review the dramatic developments in the 
experimental determinations of the properties of black-body radiation. Otto Lummer had 
been appointed to the Physikalisch-Technische Reichsanstalt in Berlin in 1887 and led
the group involved in establishing standards of luminosity. There was no intention initially 
to determine the spectrum of black-body radiation, but this came about through the need 
to develop precision methods of measurement of optical and infrared intensities. He was 
joined by Ferdinand Kurlbaum in 1891 at the Reichsanstalt and collaborated with Ernst
Pringsheim, who was appointed to a titular professorship at the University of Berlin in 1896,
and by Heinrich Rubens who was appointed to a professorship of physics at the Technische
Hochschule of Berlin in 1896.

Fig. 2.4. (a) Lummer and Pringsheim’s apparatus of 1897 for determining experimentally the Stefan-Boltzmann law with high
precision (Allen and Maxwell, 1952). (b) The spectrum of black-body radiation plotted on linear scales of intensity and
wavelength as determined by Lummer and Pringsheim in 1899 for temperatures between 700 and 1600 °C (Allen and
Maxwell, 1952).

This group made enormous advances in the determination of the properties of black-body 
radiation. In particular, they emphasised the importance of developing sources of radiation 
which were as uniform as possible. The preferred solution was to bring a cavity to as 
uniform a temperature as possible and then allow the radiation to pass outwards through an 
opening (Fig. 2.4a). In addition, Rubens pioneered the techniques of precise photometric 
measurement in the infrared regions of the spectrum which were to be crucial in the coming 
years. 

Wien (1896) followed up his studies of the spectrum of thermal radiation by attempting 
to derive the radiation law from theory. We need not go into his ideas, but he derived an 
expression for the radiation law which was consistent with his displacement law and which 
provided an excellent fit to all the data available in 1896. Wien’s theory suggested that, 
consistent with his displacement law, the spectrum should have the form 


$$u(\nu) = \frac{8\pi\alpha}{c^3}\,\nu^3\,e^{-\beta\nu/T}. \tag{2.29}$$


This is Wien’s law, written in our notation. There are two unknown constants in the formula,
$\alpha$ and $\beta$; the constant $8\pi/c^3$ has been included on the right-hand side for reasons which
will become apparent in a moment. Rayleigh’s comment on Wien’s paper was terse:


‘Viewed from the theoretical side, the result appears to me to be little more than a
conjecture.’ (Rayleigh, 1900) 


The importance of the formula was that it gave an excellent account of all the experimental 
data available at the time and therefore could be used in theoretical studies. Typical black-body
spectra are shown in Fig. 2.4b. The wavelength regions close to the maxima were best
defined by the experimental data — the uncertainties in the wings of the energy distribution 
were much greater. 

Returning to Planck’s paper of March 1900, the next step was the definition of the entropy 
S of an oscillator, 


$$S = -\frac{E}{\beta\nu}\,\ln\frac{E}{\alpha\nu e}, \tag{2.30}$$


where E is its energy, $\alpha$ and $\beta$ are the constants in Wien’s law and e is the base of natural
logarithms. In fact, Planck worked backwards from Wien’s law to determine this relation.⁸
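
The backward step from Wien’s law to (2.30) can be sketched as follows. The sketch assumes the relation $u(\nu) = (8\pi\nu^2/c^3)E$ between the energy density and the mean oscillator energy and the thermodynamic relation $\mathrm{d}S/\mathrm{d}E = 1/T$; the intermediate lines, and the choice of zero for the constant of integration, are a reconstruction rather than a quotation from Planck’s paper.

```latex
\begin{align*}
E &= \alpha\nu\, e^{-\beta\nu/T}
  &&\text{(Wien's law (2.29) with } u(\nu) = (8\pi\nu^2/c^3)\,E\text{)} \\
\frac{1}{T} &= -\frac{1}{\beta\nu}\ln\frac{E}{\alpha\nu}
  &&\text{(solving for } 1/T\text{)} \\
\frac{\mathrm{d}S}{\mathrm{d}E} &= -\frac{1}{\beta\nu}\ln\frac{E}{\alpha\nu}
  &&\text{(since } \mathrm{d}S/\mathrm{d}E = 1/T\text{)} \\
S &= -\frac{E}{\beta\nu}\left(\ln\frac{E}{\alpha\nu} - 1\right)
   = -\frac{E}{\beta\nu}\ln\frac{E}{\alpha\nu e}
  &&\text{(integrating, with } S = 0 \text{ at } E = 0\text{)}
\end{align*}
```

The base of natural logarithms e in (2.30) therefore arises simply from the term $-E$ in the integral of $\ln E$.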

The next step was crucial for Planck. The second derivative of (2.30) with respect to E 
is needed in order to apply (2.28) to the black-body spectrum: 


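$$\frac{\mathrm{d}^2 S}{\mathrm{d}E^2} = -\frac{1}{\beta\nu}\,\frac{1}{E}. \tag{2.31}$$


$\beta$, $\nu$ and E are all necessarily positive quantities and so $\mathrm{d}^2 S/\mathrm{d}E^2$ is necessarily negative.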
Therefore, according to (2.28), Wien’s law is entirely consistent with the second law of 
thermodynamics. The simplicity of the expression for the second derivative of the entropy 
with respect to energy profoundly impressed Planck who remarked: 


‘I have repeatedly tried to change or generalise the equation for the electromagnetic 
entropy of an oscillator in such a way that it satisfies all theoretically sound electromagnetic 
and thermodynamic laws but I was unsuccessful in this endeavour.’ (Planck, 1900a) 


In his paper presented to the Prussian Academy of Sciences in May 1899 he stated: 


‘I believe that this must lead me to conclude that the definition of radiation entropy and 
therefore Wien’s energy distribution law necessarily result from the application of the 
principle of the increase of entropy to the electromagnetic radiation theory and therefore 
the limits of validity of this law, in so far as they exist at all, coincide with those of the 
second law of thermodynamics.’ (Planck, 1899) 


Planck had gone too far — in fact, many negative functions of energy would have satisfied 
Planck’s requirement so far as the second law of thermodynamics was concerned. 

These calculations were presented to the Prussian Academy of Sciences in June 1900. 
By October 1900, Rubens and Kurlbaum had shown beyond any doubt that Wien’s law was 
inadequate to explain the spectrum of black-body radiation at low frequencies and high 
temperatures. They showed that, at low frequencies and high temperatures, the intensity
of radiation was proportional to temperature. This is clearly inconsistent with Wien’s
law because, if $u(\nu) \propto \nu^3 e^{-\beta\nu/T}$, then for $\beta\nu/T \ll 1$, $u(\nu) \propto \nu^3$ and is independent of
temperature. Therefore, the functional dependence must depart from (2.30) and (2.31) for
small values of $\nu/T$.

Rubens and Kurlbaum showed Planck their results before they were presented in October 
1900 and he was given the opportunity to make some remarks about their implications. The 
result was his paper entitled On an improvement of Wien’s spectral distribution (Planck,
1900b). Here is what Planck did. At low frequencies, $\nu/T \to 0$, the experiments of Rubens
and Kurlbaum showed that $u(\nu) \propto T$. The thermodynamic relations between S, E and T are

$$E \propto T, \qquad \frac{\mathrm{d}S}{\mathrm{d}E} = \frac{1}{T}.$$

Hence

$$\frac{\mathrm{d}S}{\mathrm{d}E} \propto \frac{1}{E}, \qquad \frac{\mathrm{d}^2 S}{\mathrm{d}E^2} \propto -\frac{1}{E^2}. \tag{2.32}$$

Therefore, $\mathrm{d}^2 S/\mathrm{d}E^2$ must change its functional dependence upon E between large and
small values of $\nu/T$. Wien’s law remains good for large values of $\nu/T$ and leads to

$$\frac{\mathrm{d}^2 S}{\mathrm{d}E^2} \propto -\frac{1}{E}. \tag{2.33}$$

The standard technique for combining functions of the form (2.32) and (2.33) is to try an
expression of the form

$$\frac{\mathrm{d}^2 S}{\mathrm{d}E^2} = -\frac{a}{E(b+E)}, \tag{2.34}$$

which has exactly the required properties for large and small values of E, $E \gg b$ and $E \ll b$
respectively. The rest of the analysis is straightforward.⁹ Integrating with respect to E,








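$$\frac{\mathrm{d}S}{\mathrm{d}E} = -\int \frac{a}{E(b+E)}\,\mathrm{d}E = -\frac{a}{b}\left[\ln E - \ln(b+E)\right] = \frac{1}{T}, \tag{2.35}$$

and hence

$$E = \frac{b}{e^{b/aT} - 1}. \tag{2.36}$$

Then,

$$u(\nu) = \frac{8\pi\nu^2}{c^3}\,E = \frac{8\pi\nu^2}{c^3}\,\frac{b}{e^{b/aT} - 1}. \tag{2.37}$$

From the high frequency, low temperature limit, we can find the constants from Wien’s law
(2.29):

$$u(\nu) = \frac{8\pi\nu^2}{c^3}\,\frac{b}{e^{b/aT}} = \frac{8\pi\alpha}{c^3}\,\frac{\nu^3}{e^{\beta\nu/T}}.$$

Thus b must be proportional to the frequency $\nu$. We can now write Planck’s formula in its
primitive form:

$$u(\nu) = \frac{A\nu^3}{e^{\beta\nu/T} - 1}. \tag{2.38}$$

Of equal importance for the next part of the story is that Planck was also able to find an
expression for the entropy of an oscillator by integrating dS/dE. From (2.35),

$$S = -a\left[\frac{E}{b}\ln\frac{E}{b} - \left(1 + \frac{E}{b}\right)\ln\left(1 + \frac{E}{b}\right)\right], \tag{2.39}$$

with $b \propto \nu$. Planck’s new formula could now be confronted with the experimental evidence.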
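
The two limiting forms of the mean oscillator energy (2.36) can be checked numerically. The following is a minimal sketch in which the values of a, b and the two temperatures are arbitrary illustrative choices, not values taken from Planck’s paper.

```python
import math

def mean_energy(T, a, b):
    """Mean oscillator energy E = b / (exp(b/(a*T)) - 1), equation (2.36)."""
    return b / math.expm1(b / (a * T))

# Illustrative values only (arbitrary units); a and b are the constants of (2.34).
a, b = 1.0, 1.0

# High temperature, b/(aT) << 1: E approaches a*T, so E is proportional to T.
T_hot = 1.0e4
ratio_hot = mean_energy(T_hot, a, b) / (a * T_hot)

# Low temperature, b/(aT) >> 1: E approaches b*exp(-b/(aT)), the Wien form.
T_cold = 0.05
ratio_cold = mean_energy(T_cold, a, b) / (b * math.exp(-b / (a * T_cold)))

print(ratio_hot, ratio_cold)
```

Both ratios come out very close to unity, so that the single expression (2.36) reproduces the proportionality to temperature found by Rubens and Kurlbaum at one extreme and Wien’s exponential form at the other.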
Fig. 2.5. Comparison of the radiation formulae of Planck (2.38, solid line), Wien (2.29, dotted line) and Rayleigh (2.40,
dot-dashed line) with the intensity of black-body radiation at 8.85 μm as a function of temperature (in K) as measured by
Rubens and Kurlbaum (filled boxes). Similar experiments were carried out at longer wavelengths, (24 + 31.6) μm and
51.2 μm, at which there was little difference between the predictions of the Planck and Rayleigh functions. (Replotted
from the data presented by Rubens and Kurlbaum (1901) in the same format as their original presentation.)


2.5 Comparison of the laws for black-body radiation with experiment 


In their presentation of 25 October 1900, Rubens and Kurlbaum compared their new precise 
measurements of the spectrum of black-body radiation with five different formulae. Two 
of these were empirical relations proposed by Thiesen and by Lummer and Jahnke. The 
others were: 


• Wien’s relation (2.29);

• Planck’s formula (2.38);

• Rayleigh’s result (2.27), but modified to avoid the divergence of the Rayleigh-Jeans spectrum
at high frequencies, the problem known as the ultraviolet catastrophe. Rayleigh was
well aware of ‘the difficulties which attend the Boltzmann-Maxwell doctrine of the
partition of energy’ and that (2.27) must break down at high frequencies because the
spectrum of black-body radiation does not increase as $\nu^2$ to infinite frequency. In
the fifth paragraph of his short paper, he stated, ‘If we introduce the exponential factor,
the complete expression is’, in our notation,


$$u(\nu) = \frac{8\pi\nu^2}{c^3}\,kT\,e^{-\beta\nu/T}. \tag{2.40}$$


The data presented by Rubens and Kurlbaum (1901) are replotted in Fig. 2.5 showing only 
the formulae proposed by Wien, Planck and Rayleigh. Rubens and Kurlbaum concluded 
that Planck’s formula was superior to all the others and that it gave precise agreement with
experiment. Rayleigh’s proposal was found to be a poor representation of the experimental 
data. Rayleigh was somewhat upset by the tone of voice in which they discussed his result. 
When his scientific papers were republished two years later he remarked on his important 
conclusion that the intensity of radiation should be proportional to temperature at low 
frequencies. He pointed out, 


‘This is what I intended to emphasise. Very shortly afterwards the anticipation above 
expressed was confirmed by the important researches of Rubens and Kurlbaum who 
operated with exceptionally long waves.’ (Rayleigh, 1902) 
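
The behaviour of the three formulae at a fixed wavelength of 8.85 μm can be sketched numerically. The example below uses modern values of h, k and c and assumes the identifications $\alpha = h$ and $\beta = h/k$ in (2.29) and (2.40); these are illustrative assumptions, and the normalisation adopted by Rubens and Kurlbaum in their comparison may well have been different.

```python
import math

H = 6.62607015e-34   # Planck constant, J s (modern value)
K = 1.380649e-23     # Boltzmann constant, J/K (modern value)
C = 2.99792458e8     # speed of light, m/s

def mode_factor(nu):
    """The common prefactor 8*pi*nu^2/c^3."""
    return 8.0 * math.pi * nu**2 / C**3

def u_planck(nu, T):
    """Planck's formula in its final form."""
    return mode_factor(nu) * H * nu / math.expm1(H * nu / (K * T))

def u_wien(nu, T):
    """Wien's law (2.29), assuming alpha = h and beta = h/k."""
    return mode_factor(nu) * H * nu * math.exp(-H * nu / (K * T))

def u_rayleigh(nu, T):
    """Rayleigh's modified formula (2.40), assuming beta = h/k."""
    return mode_factor(nu) * K * T * math.exp(-H * nu / (K * T))

nu = C / 8.85e-6  # frequency corresponding to a wavelength of 8.85 micrometres
for T in (300.0, 600.0, 1000.0, 1500.0):
    p = u_planck(nu, T)
    print(f"T = {T:6.0f} K   Wien/Planck = {u_wien(nu, T) / p:.3f}   "
          f"Rayleigh/Planck = {u_rayleigh(nu, T) / p:.3f}")
```

With these assumed constants, Wien’s form follows Planck’s closely at the lower temperatures and falls progressively below it as the temperature increases, which is the sense of the discrepancy at long wavelengths and high temperatures.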


Despite this success, Planck had not explained anything. The formula he derived was 
based upon essentially thermodynamic arguments guided by experiment without theoretical 
understanding at the microscopic level. The formula (2.38) was presented to the German 
Physical Society on 19 October 1900 and on 14 December 1900, he presented another 
paper entitled On the theory of the energy distribution law in the normal spectrum (Planck, 
1900c). In his memoirs, he wrote 


‘After a few weeks of the most strenuous work of my life, the darkness lifted and an 
unexpected vista began to appear.’ (Planck, 1925) 


2.6 Planck's theory of black-body radiation 


In his scientific biography, Planck wrote: 


‘On the very day when I formulated this law, I began to devote myself to the task of 
investing it with a true physical meaning. This quest automatically led me to study the 
interrelation of entropy and probability — in other words, to pursue the line of thought 
inaugurated by Boltzmann.’ (Planck, 1950) 


Planck recognised that the way forward involved adopting a point of view which he had 
rejected in essentially all his previous work. He was not a specialist in statistical physics and 
we will find that his analysis did not follow the precepts of classical statistical mechanics. 
Despite the basic flaws in his argument, he discovered the essential role which quantisation
plays in accounting for the spectrum of black-body radiation. 

Planck’s analysis began by following Boltzmann’s procedure. There is a fixed total energy 
E to be divided among the N oscillators and energy elements $\varepsilon$ are introduced. Therefore,
there are $r = E/\varepsilon$ energy elements to be shared among the oscillators. Rather than following
Boltzmann’s procedure in statistical physics, however, Planck simply worked out the total
number of ways in which r energy elements can be distributed over the N oscillators. The
answer is¹⁰

$$\frac{(N+r-1)!}{r!\,(N-1)!}. \tag{2.41}$$
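
The counting formula (2.41) can be checked by brute force for small numbers of oscillators and energy elements. The sketch below lists every set of occupation numbers explicitly; the function names and the test cases are illustrative choices.

```python
from itertools import product
from math import factorial

def count_by_enumeration(N, r):
    """Count the ways of giving r identical energy elements to N distinguishable
    oscillators by listing every set of occupation numbers that sums to r."""
    return sum(1 for occ in product(range(r + 1), repeat=N) if sum(occ) == r)

def count_by_formula(N, r):
    """Equation (2.41): (N + r - 1)! / (r! (N - 1)!)."""
    return factorial(N + r - 1) // (factorial(r) * factorial(N - 1))

for N, r in [(2, 3), (3, 4), (4, 5)]:
    print(N, r, count_by_enumeration(N, r), count_by_formula(N, r))
```

The enumeration treats the energy elements as indistinguishable and the oscillators as distinguishable, which is the counting that (2.41) expresses.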


Now Planck made the crucial step in his argument. He defined (2.41) to be the probability
p to be used in Boltzmann’s expression for the entropy

$$S = C \ln p.$$

Let us see where this leads. N and r are very large indeed and so we can use Stirling’s
approximation for n!:

$$n! = (2\pi n)^{1/2}\left(\frac{n}{e}\right)^{n}\left(1 + \frac{1}{12n} + \cdots\right). \tag{2.42}$$


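We need to take the logarithm of (2.41) and so we can use an even simpler approximation,
$n! \approx n^n$, and so

$$p = \frac{(N+r-1)!}{r!\,(N-1)!} \approx \frac{(N+r)!}{r!\,N!} \approx \frac{(N+r)^{N+r}}{r^r N^N}. \tag{2.43}$$

Therefore,

$$S = C\left[(N+r)\ln(N+r) - r\ln r - N\ln N\right], \qquad r = \frac{E}{\varepsilon} = \frac{N\bar{E}}{\varepsilon}, \tag{2.44}$$

where $\bar{E}$ is the average energy of the oscillators. Therefore,

$$S = C\left\{N\left(1 + \frac{\bar{E}}{\varepsilon}\right)\ln\left[N\left(1 + \frac{\bar{E}}{\varepsilon}\right)\right] - \frac{N\bar{E}}{\varepsilon}\ln\frac{N\bar{E}}{\varepsilon} - N\ln N\right\}. \tag{2.45}$$

The average entropy per oscillator $\bar{S}$ is therefore

$$\bar{S} = \frac{S}{N} = C\left[\left(1 + \frac{\bar{E}}{\varepsilon}\right)\ln\left(1 + \frac{\bar{E}}{\varepsilon}\right) - \frac{\bar{E}}{\varepsilon}\ln\frac{\bar{E}}{\varepsilon}\right]. \tag{2.46}$$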
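
The crude approximation $n! \approx n^n$ may look alarming, but its adequacy for large N and r can be tested directly. The sketch below compares the exact logarithm of (2.41), evaluated with the log-gamma function, with the entropy per oscillator (2.46) in units of C; the ratio x of mean energy to energy element is a hypothetical value chosen only for illustration.

```python
from math import lgamma, log

def ln_p_exact(N, r):
    """ln of (2.41), using lgamma(n + 1) = ln(n!)."""
    return lgamma(N + r) - lgamma(r + 1) - lgamma(N)

def entropy_per_oscillator(x):
    """(2.46) in units of C, with x standing for the ratio of mean energy to epsilon."""
    return (1.0 + x) * log(1.0 + x) - x * log(x)

x = 2.5  # hypothetical mean number of energy elements per oscillator
for N in (10, 1_000, 100_000):
    r = int(x * N)
    print(N, ln_p_exact(N, r) / N, entropy_per_oscillator(x))
```

The exact value of $(\ln p)/N$ approaches the approximate expression as N increases, the neglected terms being of order $(\ln N)/N$.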


But this looks rather familiar. The relation (2.46) is exactly the expression (2.39) for the
entropy of an oscillator which Planck had derived to account for the spectrum of black-body
radiation, namely:

$$S = a\left[\left(1 + \frac{E}{b}\right)\ln\left(1 + \frac{E}{b}\right) - \frac{E}{b}\ln\frac{E}{b}\right],$$

with the requirement $b \propto \nu$. Thus, the energy elements $\varepsilon$ must be proportional to frequency
and Planck wrote this requirement in the familiar form

$$\varepsilon = h\nu, \tag{2.47}$$

where h is rightly known as Planck’s constant. This is the origin of the concept of quantisation.
According to the procedures of classical statistical mechanics, we ought now to
allow $\varepsilon \to 0$, but evidently we cannot obtain the expression for the entropy of an oscillator
unless the energy elements do not disappear, but have finite magnitude $\varepsilon = h\nu$. Therefore,
the expression for the energy density of black-body radiation is

$$u(\nu) = \frac{8\pi h\nu^3}{c^3}\,\frac{1}{e^{h\nu/CT} - 1}. \tag{2.48}$$
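
The step from the entropy (2.46) and the energy elements (2.47) to the radiation formula (2.48) is not written out above. A sketch of the intermediate algebra, assuming only the thermodynamic relation $\mathrm{d}\bar{S}/\mathrm{d}\bar{E} = 1/T$ and the relation $u(\nu) = (8\pi\nu^2/c^3)\bar{E}$ used in (2.37), is:

```latex
\begin{align*}
\frac{\mathrm{d}\bar{S}}{\mathrm{d}\bar{E}}
  &= \frac{C}{\varepsilon}\left[\ln\!\left(1 + \frac{\bar{E}}{\varepsilon}\right)
     - \ln\frac{\bar{E}}{\varepsilon}\right]
   = \frac{C}{\varepsilon}\ln\!\left(1 + \frac{\varepsilon}{\bar{E}}\right)
   = \frac{1}{T}, \\
\bar{E} &= \frac{\varepsilon}{e^{\varepsilon/CT} - 1}
         = \frac{h\nu}{e^{h\nu/CT} - 1}, \\
u(\nu) &= \frac{8\pi\nu^2}{c^3}\,\bar{E}
        = \frac{8\pi h\nu^3}{c^3}\,\frac{1}{e^{h\nu/CT} - 1}.
\end{align*}
```

The constant C still appears in the exponent at this stage, since it has not yet been identified with Boltzmann’s constant.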
Finally, what about C? Planck pointed out that C is a universal constant relating the
entropy to the probability. Therefore, any suitable law such as the perfect gas law which
determines its value determines C for all processes. For example, we could use the results
of the Joule expansion of a perfect gas, treated classically and statistically. Then, the ratio
$C = k = R/N_{\rm A}$, where R is the gas constant and $N_{\rm A}$ is Avogadro’s number, the number of
molecules per gram molecule.¹¹ Hence, the Planck distribution in its final form is:

$$u(\nu) = \frac{8\pi h\nu^3}{c^3}\,\frac{1}{e^{h\nu/kT} - 1}. \tag{2.49}$$

Integrating, the total energy density of radiation u in the black-body spectrum is,

$$u = \int_0^\infty u(\nu)\,\mathrm{d}\nu = \frac{8\pi h}{c^3}\int_0^\infty \frac{\nu^3\,\mathrm{d}\nu}{e^{h\nu/kT} - 1} = \left(\frac{8\pi^5 k^4}{15 c^3 h^3}\right) T^4 = aT^4, \tag{2.50}$$

where

$$a = \frac{8\pi^5 k^4}{15 c^3 h^3} = 7.566 \times 10^{-16}\ \mathrm{J\,m^{-3}\,K^{-4}}.$$


We have recovered the Stefan-Boltzmann law for the energy density of radiation u. We can 
relate this energy density to the energy emitted per second from the surface of a black-body 
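maintained at temperature T. The rate at which the energy is reradiated¹² is $\tfrac{1}{4}uc$ and so

$$I = \frac{1}{4}uc = \frac{ac}{4}\,T^4 = \sigma T^4 = \left(\frac{2\pi^5 k^4}{15 c^2 h^3}\right) T^4 = 5.67 \times 10^{-8}\,T^4\ \mathrm{W\,m^{-2}}. \tag{2.51}$$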
This provides the determination of the value of the Stefan—Boltzmann constant o in terms 
of fundamental constants. 
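
The numerical values in (2.50) and (2.51) can be reproduced from the fundamental constants. The sketch below uses modern values of h, k and c, and evaluates the dimensionless integral of $x^3/(e^x - 1)$, which equals $\pi^4/15$, by a simple midpoint rule; the upper limit and the number of steps are arbitrary choices made for the illustration.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K = 1.380649e-23     # Boltzmann constant, J/K
C = 2.99792458e8     # speed of light, m/s

def bose_integral(x_max=60.0, steps=200_000):
    """Midpoint-rule estimate of the integral of x^3/(e^x - 1) from 0 to x_max."""
    dx = x_max / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += x**3 / math.expm1(x)
    return total * dx

integral = bose_integral()
a = 8.0 * math.pi**5 * K**4 / (15.0 * C**3 * H**3)   # radiation constant of (2.50)
sigma = a * C / 4.0                                    # Stefan-Boltzmann constant of (2.51)

print(integral, math.pi**4 / 15.0)
print(a, sigma)
```

The substitution $x = h\nu/kT$ is what turns the integral in (2.50) into this pure number multiplied by $(kT/h)^4$.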

What is one to make of Planck’s argument? There are two fundamental concerns: 


(1) Planck certainly does not follow Boltzmann’s procedure for finding the equilibrium 
energy distribution of the oscillators. What he defines as a probability is not really 
a probability of anything drawn from any parent population. Planck had no illusions 
about this. In his own words: 


‘In my opinion, this stipulation basically amounts to a definition of the probability 
W; for we have absolutely no point of departure, in the assumptions which underlie 
the electromagnetic theory of radiation, for talking about such a probability with a 
definite meaning.’¹³


Einstein repeatedly pointed out this weak point in Planck’s argument: 


‘The manner in which Mr Planck uses Boltzmann’s equation is rather strange to
me in that a probability of a state W is introduced without a physical definition of 
this quantity. If one accepts this, then Boltzmann’s equation simply has no physical 
meaning.’ (Einstein, 1912) 


(2) A second problem concerns a logical inconsistency in Planck’s analysis. On the one
hand, the oscillators can only take energies $E = r\varepsilon$ and yet a classical result (2.1) has
been used to work out the rate of radiation of the oscillator. Implicit in that analysis
is the assumption that the energies of the oscillators can vary continuously rather than
take only discrete values.


These were major stumbling blocks and nobody quite understood the significance of 
what Planck had done. The theory did not in any sense gain immediate acceptance, there 
being no further papers by Planck or others until Einstein’s papers of 1905 and 1906. 
Nonetheless, the concept of quantisation had been introduced through the energy elements 
$\varepsilon$ without which it is not possible to reproduce the Planck distribution. In fact, in 1906
Einstein showed that, if Planck had followed strictly Boltzmann’s procedure, he would have 
obtained the same answer whilst maintaining the essential concept of energy quantisation 
(Sect. 3.3). 

Why did Planck find the correct expression for the radiation spectrum, despite the fact 
that the statistical procedures he used were more than a little suspect? It seems quite likely 
that Planck worked backwards. It was suggested by Rosenfeld, and endorsed by Klein on 
the basis of an article by Planck of 1943, that he started with the expression of the entropy 
of an oscillator (2.39) and worked backwards to find W from exp(S/k). This results in the
permutation formula on the right-hand side of (2.43), which is more or less exactly the same 
as (2.41) for large values of N and r. The expression (2.41) was a well-known formula 
in permutation theory and appears early in Boltzmann’s exposition of the fundamentals 
of statistical physics. Planck then regarded (2.46) as the definition of entropy according 
to statistical physics. If this is indeed what happened, it in no sense diminishes Planck’s 
achievement. 


2.7 Planck and ‘natural units’ 


Planck appreciated that there are two fundamental constants in his theory of black-body
radiation, Boltzmann’s constant k and the new constant h, which Planck was to refer to as
the quantum of action. The fundamental nature of k was apparent from its appearance in
the kinetic theory as the gas constant per molecule and as the constant C in Boltzmann’s
relation $S = k \ln W$. Both constants could be determined with considerable precision from
the experimentally measured form of the spectrum of black-body radiation (2.49), and from
the value of the Stefan-Boltzmann constant (2.50) or (2.51). Combining the known value
of the gas constant R with his new determination of k, Planck found a value of Avogadro’s
number, $N_{\rm A} = 6.175 \times 10^{23}$ molecules per mole, by far the best estimate known at that
time. The present adopted value is $N_{\rm A} = 6.022 \times 10^{23}$ molecules per mole.

The electric charge carried by a gram equivalent of monovalent ions was known from
electrolytic theory and is known as Faraday’s constant. Knowing $N_{\rm A}$ precisely, Planck was
able to derive the elementary unit of charge and found it to be $e = 4.69 \times 10^{-10}$ esu,
corresponding to $1.56 \times 10^{-19}$ C. Again, Planck’s value was by far the best available at
that time, contemporary experimental values ranging between 1.3 and $6.5 \times 10^{-10}$ esu. The
present standard value is $1.602 \times 10^{-19}$ C.
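
The arithmetic connecting Avogadro’s number to the elementary charge is simply the division of Faraday’s constant by the number of ions per mole. The sketch below uses Planck’s value of Avogadro’s number, 6.175 × 10²³ per mole, together with modern values of Faraday’s constant and of the conversion between coulombs and electrostatic units; the modern constants are assumptions of the illustration and are not necessarily the figures Planck himself used.

```python
FARADAY = 96485.0          # Faraday constant, C per mole (modern value, assumed here)
ESU_PER_COULOMB = 2.998e9  # electrostatic units of charge per coulomb

def elementary_charge(n_avogadro):
    """Charge per monovalent ion: Faraday's constant divided by Avogadro's number."""
    return FARADAY / n_avogadro

e_planck = elementary_charge(6.175e23)   # using Planck's value of Avogadro's number
e_modern = elementary_charge(6.022e23)   # using the present adopted value

print(e_planck, e_planck * ESU_PER_COULOMB)
print(e_modern)
```

The small excess in Planck’s value of Avogadro’s number thus accounts directly for the small deficit in his value of the elementary charge.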
Table 2.1 Planck’s system of natural units


Time     $t_{\rm Pl} = (Gh/c^5)^{1/2}$    $10^{-43}$ s
Length   $l_{\rm Pl} = (Gh/c^3)^{1/2}$    $4 \times 10^{-35}$ m
Mass     $m_{\rm Pl} = (hc/G)^{1/2}$      $5.4 \times 10^{-8}$ kg = $3 \times 10^{19}$ GeV


Equally compelling for Planck was the fact that h, in conjunction with the constant of
gravitation G and the velocity of light c, enabled a set of ‘natural’ units to be defined in
terms of fundamental constants (Table 2.1). Often these Planck units are written in terms of
$\hbar = h/2\pi$, in which case their values are 2.5 times smaller than those quoted in the table. It
is evident that the natural units of time and length are very small indeed, while the mass is 
much greater than that of any known elementary particle. A century later, these quantities 
were to play a central role in the physics of the very early Universe. 
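
The natural units follow directly from the three constants G, h and c. The sketch below uses modern values of the constants, which are assumptions of the illustration, and also evaluates the factor $(2\pi)^{1/2}$ by which the units shrink when $\hbar$ is used in place of h.

```python
import math

G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
H = 6.62607015e-34     # Planck constant, J s
C = 2.99792458e8       # speed of light, m/s
GEV = 1.602176634e-10  # one GeV expressed in joules

t_pl = math.sqrt(G * H / C**5)   # natural unit of time, s
l_pl = math.sqrt(G * H / C**3)   # natural unit of length, m
m_pl = math.sqrt(H * C / G)      # natural unit of mass, kg
m_pl_gev = m_pl * C**2 / GEV     # rest-mass energy of the mass unit, GeV

print(t_pl, l_pl, m_pl, m_pl_gev)
print(math.sqrt(2.0 * math.pi))  # ratio of the h-based to the hbar-based units
```

Each unit depends on the square root of h, which is why all three change by the same factor of about 2.5.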


2.8 Planck and the physical significance of h


It was a number of years before the truly revolutionary nature of what Planck had achieved in 
these crucial last months of 1900 was appreciated. Perhaps surprisingly, he wrote no papers 
on the subject of quantisation over the next five years. The next publication which casts 
some light on his understanding was his text Lectures on the Theory of Thermal Radiation 
of 1906 (Planck, 1906). Thomas Kuhn gives a detailed analysis of Planck’s thoughts on 
quantisation through the period 1900-1906 (Kuhn, 1978). What is clear from Kuhn’s 
analysis is that Planck undoubtedly believed that the classical laws of electromagnetism 
were applicable to the processes of the emission and absorption of radiation, despite the 
introduction of the finite energy elements in his theory of quantisation. Planck describes 
two versions of Boltzmann’s procedure in statistical physics, the first being the version 
described in Sect. 2.6, in which the energies of the oscillators take values 0, $\varepsilon$, $2\varepsilon$, $3\varepsilon$, and
so on. There is a second version in which the molecules were considered to lie within the
energy ranges 0 to $\varepsilon$, $\varepsilon$ to $2\varepsilon$, $2\varepsilon$ to $3\varepsilon$, and so on. This procedure leads to exactly the same
statistical probabilities as the first version. In a subsequent passage, in which the motions
of the oscillators are traced in phase space, he again refers to the energies of the trajectories
corresponding to certain energy ranges U to $U + \Delta U$, the $\Delta U$ eventually being identified
with $h\nu$. Thus, Planck regarded quantisation as referring to the average properties of the
oscillators. 

Planck had little to say about the nature of the quantum of action h, but he was well 
aware of its fundamental importance. In his words, 


‘The thermodynamics of radiation will therefore not be brought to an entirely satisfactory
state until the full and universal significance of the constant h is understood.’ (Kuhn, 
1978) 
Planck spent many years trying to reconcile his theory with classical physics, but he
failed to find any physical significance for h beyond its appearance in the radiation formula. 
In his words: 


‘My futile attempts to fit the elementary quantum of action somehow into the classical 
theory continued for a number of years and they cost me a great deal of effort. Many of 
my colleagues saw in this something bordering on tragedy. But I feel differently about it. 
For the thorough enlightenment I thus received was all the more valuable. I now knew for 
a fact that the elementary quantum of action played a far more significant part in physics 
than I had originally been inclined to suspect and this recognition made me see clearly the 
need for the introduction of totally new methods of analysis and reasoning in the treatment 
of atomic problems.’ (Planck, 1950) 


It was not until after about 1908 that Planck fully appreciated the quite fundamental nature 
of quantisation, which has no counterpart in classical physics. His original view was that 
the introduction of energy elements was 


‘a purely formal assumption and I really did not give it much thought except that no matter 
what the cost, I must bring about a positive result.’ (Planck, 1931a)


Later in the same letter, he writes: 


‘It was clear to me that classical physics could offer no solution to this problem and would 
have meant that all energy would eventually transfer from matter into radiation. In order 
to prevent this, a new constant is required to assure that energy does not disintegrate. 
But the only way to recognise how this can be done is to start from a definite point of 
view. This approach was opened to me by maintaining the two laws of thermodynamics. 
The two laws, it seems to me, must be upheld under all circumstances. For the rest, I was 
ready to sacrifice every one of my previous convictions about physical laws.’ 


The next great steps were taken by Albert Einstein and it is no exaggeration to state that 
he was the first person to appreciate the full significance of quantisation and the reality of 
quanta. He showed that these are fundamental aspects of all physical phenomena, rather 
than just a ‘formal device’ for accounting for the Planck distribution. From 1905 onwards, 
he never deviated from his belief in the reality of quanta — it was some considerable time 
before the great figures of the day conceded that Einstein was indeed correct. 

Einstein completed what we would now call his undergraduate studies in August 1900. 
Between 1902 and 1904, he wrote three papers on the foundations of Boltzmann’s statistical 
mechanics. In 1905, Einstein was 26 and employed as ‘technical expert, third class’ at the 
Swiss patent office in Bern. In that year, he completed his doctoral dissertation on A new 
determination of molecular dimensions, which he presented to the University of Zurich on 
20 July 1905. In the same year, he published three papers which are among the greatest 
classics in the literature of physics.¹ Any one of them would have ensured that his name
remained a permanent fixture in the scientific literature. These papers are: 


(1) On a heuristic point of view concerning the production and transformation of light 
(Einstein, 1905a); 

(2) On the motion of small particles suspended in stationary liquids required by the 
molecular-kinetic theory of heat (Einstein, 1905b); 

(3) On the electrodynamics of moving bodies (Einstein, 1905c). 


The third paper is Einstein’s paper on the special theory of relativity which was described 
briefly in Sect. 1.5. The second paper confirmed the correctness of the kinetic theory while 
the first introduces the concept of light quanta to describe the spectrum of black-body 
radiation. Einstein confessed that he was no mathematician and the mathematics needed 
to understand his papers of 1905 is no more than is taught in the first two years of an 
undergraduate physics course. His genius lay in his extraordinary physical intuition, which 
enabled him to see deeper into physical problems than his contemporaries. The three great 
papers were not the result of a sudden burst of creativity, but the product of deep pondering 
about the fundamental problems of physics over almost a decade. His deliberations on 
all three topics suddenly came to fruition almost simultaneously in 1905. Despite their 
obvious differences, the three papers have a striking commonality of approach. In each 
case, Einstein stands back from the specific problem at hand and studies the underlying 
physical principles. 


3.2 Einstein on Brownian motion 


The second paper is more familiarly known by the title of a subsequent paper published in 
1906 entitled On the theory of Brownian motion (Einstein, 1906a) and is a reworking of 
some of the results of his doctoral dissertation. Brownian motion is the irregular motion 
of microscopic particles in fluids and had been studied in detail in 1828 by the botanist 
Robert Brown, who had noted the ubiquity of the phenomenon. The motion results from 
the statistical effect of very large numbers of collisions between molecules of the fluid and 
the microscopic particles. Although each impact is very small, the net result of a very large 
number of them randomly colliding with the particle is a ‘drunken man’s walk’. Einstein 
was not certain about the applicability of his analysis to Brownian motion, writing in the 
introduction to his paper: 


‘It is possible that the movements to be discussed here are identical with the so-called 
‘Brownian molecular motion’; however, the information available to me regarding the 
latter is so lacking in precision that I can form no judgment in the matter.’ 


In his autobiographical notes, he stated that he wrote the paper ‘without knowing that 
observations concerning Brownian motion were already long familiar’ (Einstein, 1979). 

Einstein begins with Stokes’ formula for the force acting on a sphere of radius a moving
at speed v through a medium of viscosity $\eta$, $F = 6\pi\eta a v$, where $\eta$ is the coefficient of
dynamic viscosity of the fluid. By considering the one-dimensional
diffusion of the particles in the steady state, he found the diffusion coefficient
for the particles in the medium, $D = kT/6\pi\eta a$, and from this the one-dimensional distance
the particle diffuses, $\langle\lambda_x^2\rangle = 2Dt$, in time t. The result is his famous formula for the mean
squared distance travelled by the particle in time t in one dimension,

$$\langle\lambda_x^2\rangle = \frac{kTt}{3\pi\eta a}, \tag{3.1}$$

where T is the temperature and k Boltzmann’s constant. Crucially, Einstein had discovered
the relation between the molecular properties of fluids and the observed diffusion of
macroscopic particles. In his estimates of the magnitude of the effect for particles 1 μm
in diameter, he needed a value for Avogadro’s number $N_{\rm A}$ and he used the values which
Planck and he had found from their studies of the spectrum of black-body radiation (see
Sect. 3.3 below). He predicted that such particles would diffuse about 6 μm in one minute.
In the last paragraph of the paper, Einstein states: 


‘Let us hope that a researcher will soon succeed in solving the problem presented here, 
which is so important for the theory of heat!’ 
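
The order of magnitude of Einstein’s prediction can be recovered from (3.1). The sketch below takes the root-mean-square displacement in one dimension for a particle 1 μm in diameter; the temperature and the viscosity of water (written eta in the code) are assumed illustrative values and the modern value of Boltzmann’s constant is used, so the result should be read as an estimate rather than as a reproduction of Einstein’s own arithmetic.

```python
import math

K = 1.380649e-23   # Boltzmann constant, J/K (modern value)

def rms_displacement(T, eta, a, t):
    """Root-mean-square displacement in one dimension from (3.1):
    mean squared displacement = k T t / (3 pi eta a)."""
    return math.sqrt(K * T * t / (3.0 * math.pi * eta * a))

T = 290.0        # assumed temperature, K (about 17 degrees C)
eta = 1.35e-3    # assumed viscosity of water, Pa s
a = 0.5e-6       # radius of a particle 1 micrometre in diameter, m

for t in (1.0, 60.0):
    print(t, rms_displacement(T, eta, a, t))
```

Because the displacement grows only as the square root of the time, the distance covered in one minute is less than eight times that covered in one second.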
Fig. 3.1. Three tracings of the motion of colloidal particles of radius 0.53 μm, as seen under the microscope. Successive
positions every 30 seconds are joined by straight line segments, the spacings of the grid lines being 3.2 μm. This
drawing is based upon diagrams in Perrin’s paper of 1909 (Perrin, 1909).


Precise observations of Brownian motion were difficult at that time, but in 1908, Jean 
Perrin (1909) carried out a meticulous series of brilliant experiments which confirmed in 
detail all Einstein’s predictions (Fig. 3.1). This work convinced everyone, even the sceptics, 
of the reality of molecules. In Perrin’s words, 


‘I think it is impossible that a mind free from all preconception can reflect upon the 
extreme diversity of the phenomena which thus converge to the same result without 
experiencing a strong impression, and I think it will henceforth be difficult to defend by 
rational arguments a hostile attitude to molecular hypotheses.’ (Perrin, 1910) 


Einstein was well aware of the importance of this calculation for the theory of heat — the 
agitational motion of the particles observed in Brownian motion is heat, the macroscopic 
particles reflecting the motion of the molecules on the microscopic scale. 


3.3 On a heuristic viewpoint concerning the production and transformation of light (Einstein 1905a)


The first paper listed in Sect. 3.1 is often referred to as Einstein’s paper on the photoelectric 
effect but that scarcely does justice to its profundity. In a letter to his friend Conrad Habicht 
in May 1905, Einstein wrote: 


‘I promise you four papers . . . the first of which I could send you soon, since I will soon 
receive the free reprints. The paper deals with radiation and the energetic properties of 
light and is very revolutionary, . . .’ (Einstein, 1993)


Here are the opening paragraphs of Einstein’s great paper. They demand attention, like the 
opening of a great symphony. 


‘There is a profound formal difference between the theoretical ideas which physicists 
have formed concerning gases and other ponderable bodies and Maxwell’s theory of 
electromagnetic processes in so-called empty space. Thus, while we consider the state 
of a body to be completely defined by the positions and velocities of a very large but 
finite number of atoms and electrons, we use continuous three-dimensional functions to 
determine the electromagnetic state existing within some region, so that a finite number
of dimensions is not sufficient to determine the electromagnetic state of the region
completely . . .

The undulatory theory of light, which operates with continuous three-dimensional 
functions, applies extremely well to the explanation of purely optical phenomena and will 
probably never be replaced by any other theory. However, it should be kept in mind that 
optical observations refer to values averaged over time and not to instantaneous values. 
Despite the complete experimental verification of the theory of diffraction, reflection, 
refraction, dispersion and so on, it is conceivable that a theory of light operating with 
continuous three-dimensional functions will lead to conflicts with experience if it is 
applied to the phenomena of light generation and conversion.’ (Einstein, 1905b) 


In other words, there may well be circumstances under which Maxwell’s theory of the
electromagnetic field cannot explain all electromagnetic phenomena, and Einstein specifically
gives as examples the spectrum of black-body radiation, photoluminescence and the
photoelectric effect. His proposal is that, for some purposes, it may be more appropriate to
consider light to be


‘discontinuously distributed in space. According to the assumption considered here, in 
the propagation of a light ray emitted from a point source, the energy is not distributed 
continuously over ever-increasing volumes of space, but consists of a finite number of 
energy quanta localised at points in space that move without dividing, and can be absorbed 
and generated only as complete units.’ 


He ends by hoping that 
‘the approach to be presented will prove of use to some researchers in their investigations.’ 


Notice how Einstein’s proposal differs from Planck’s approach. Planck found that the ‘energy
elements’ $\varepsilon = h\nu$ associated with the oscillators in thermal equilibrium at temperature
$T$ must not vanish. These oscillators are the source of the electromagnetic radiation in the
black-body spectrum, but Planck had absolutely nothing to say about the radiation emitted
by them. He firmly believed that the waves emitted by the oscillators were the classical
electromagnetic waves of Maxwell. In contrast, Einstein proposes that the radiation field
itself should be quantised.

After the introduction, Einstein states Planck’s formula (2.26) relating the average energy
of an oscillator to the energy density of black-body radiation in thermodynamic equilibrium
but does not hesitate to set the average energy of the oscillator $\bar{E} = kT$ according to the
kinetic theory. He then writes the total energy in the black-body spectrum in the provocative
form

\[ \text{total energy density} = \int_0^\infty u(\nu)\,d\nu = \frac{8\pi kT}{c^3} \int_0^\infty \nu^2\,d\nu = \infty. \tag{3.2} \]


This is exactly the problem which had been pointed out by Rayleigh in 1900 and which led to 
his arbitrary introduction of the exponential factor to prevent the spectrum diverging at high 
frequencies (see Sect. 2.5). This phenomenon was later called the ultraviolet catastrophe 
by Paul Ehrenfest. 

Einstein next goes on to show that, despite the high frequency divergence of the expression 
(3.2), it is a very good description of the black-body radiation spectrum at low frequencies 
and high temperatures, and hence the value of Boltzmann’s constant k can be derived from 
that part of the spectrum alone. Einstein’s value of k agreed precisely with Planck’s estimate, 
which Einstein interpreted as meaning that Planck’s estimate was independent of the details 
of the theory he had developed to account for the black-body spectrum. 
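
The agreement at low frequencies and the failure at high frequencies can be illustrated numerically. The following Python sketch compares the classical integrand of (3.2), $u(\nu) = 8\pi\nu^2 kT/c^3$, with Planck’s formula, which is quoted later in this chapter as (3.38). The temperature of 1500 K and the sampled values of $h\nu/kT$ are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math

H = 6.62607015e-34   # Planck's constant, J s
K_B = 1.380649e-23   # Boltzmann's constant, J/K
C = 2.99792458e8     # speed of light, m/s

def u_classical(nu, T):
    """Classical energy density per unit frequency, the integrand of (3.2)."""
    return 8.0 * math.pi * nu**2 * K_B * T / C**3

def u_planck(nu, T):
    """Planck's energy density per unit frequency, as in (3.38)."""
    return (8.0 * math.pi * H * nu**3 / C**3) / math.expm1(H * nu / (K_B * T))

T = 1500.0  # illustrative temperature, K
ratios = {}
for x in (0.01, 0.1, 1.0, 10.0):      # x = h nu / kT
    nu = x * K_B * T / H
    ratios[x] = u_planck(nu, T) / u_classical(nu, T)
    print(f"h nu / kT = {x:5.2f}: Planck / classical = {ratios[x]:.4g}")
```

The ratio is $x/(e^x - 1)$ with $x = h\nu/kT$, so it tends to unity for $x \ll 1$, where k can be read off from the classical expression, and falls away rapidly once $x$ exceeds unity.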

We have already emphasised the central role which entropy plays in the thermodynamics 
of radiation. Einstein now derives a suitable form for the entropy of the spectrum of black- 
body radiation using only thermodynamics and the observed form of the radiation spectrum. 
Entropies are additive, and since, in thermal equilibrium, we may consider the radiation of 
different wavelengths to be independent, we can write the entropy of the radiation enclosed 
in volume V as 


\[ S = V \int_0^\infty \phi[u(\nu), \nu]\,d\nu. \tag{3.3} \]

The function $\phi$ is the entropy of the radiation per unit frequency interval per unit volume.
The aim of the calculation is to find an expression for the function $\phi$ in terms of the spectral
energy density $u(\nu)$ and frequency $\nu$. No other quantities besides the temperature $T$ can
be involved in the expression for the equilibrium spectrum, as was shown by Kirchhoff
(see Sect. 1.6). The problem had already been solved by Wien but Einstein gives an elegant
proof of the result,


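\[ \frac{\partial \phi}{\partial u} = \frac{1}{T}, \tag{3.4} \]

resulting in a pleasant symmetry between the relations:

\[ S = \int_0^\infty \phi\,d\nu, \qquad E = \int_0^\infty u(\nu)\,d\nu, \qquad \frac{dS}{dE} = \frac{1}{T}, \qquad \frac{\partial \phi}{\partial u} = \frac{1}{T}. \tag{3.5} \]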

Einstein now uses (3.5) to work out the entropy of black-body radiation. Rather than use 
Planck’s formula, he uses Wien’s formula since, although it is not correct for low frequencies 
and high temperatures, it is the correct law in the region where the classical theory breaks 
down and so the analysis of this part of the spectrum is likely to give insight into where the 
classical calculation goes wrong.


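First, Einstein writes down the form of Wien’s law as derived from experiment. In the
notation of (2.29),

\[ u(\nu) = \frac{8\pi\alpha}{c^3}\,\nu^3\,e^{-\beta\nu/T}. \tag{3.6} \]

Taking logarithms, we can therefore find an expression for $1/T$,

\[ \frac{1}{T} = \frac{1}{\beta\nu} \ln\frac{8\pi\alpha\nu^3}{c^3 u} = \frac{\partial\phi}{\partial u}. \tag{3.7} \]

The expression for $\phi$ is found by integration:

\[ \phi = -\frac{u}{\beta\nu}\left[\ln\frac{c^3 u}{8\pi\alpha\nu^3} - 1\right]. \tag{3.8} \]

Now, consider the energy density of radiation in the spectral range $\nu$ to $\nu + \Delta\nu$ which has
energy $\varepsilon = V u\,\Delta\nu$, where $V$ is the volume. The entropy associated with this radiation is

\[ S = V\phi\,\Delta\nu = -\frac{\varepsilon}{\beta\nu}\left[\ln\frac{c^3\varepsilon}{8\pi\alpha\nu^3 V\,\Delta\nu} - 1\right]. \tag{3.9} \]

Suppose the volume changes from $V_0$ to $V$, while the total energy remains constant. Then,
the entropy change is

\[ S - S_0 = \frac{\varepsilon}{\beta\nu} \ln(V/V_0). \tag{3.10} \]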


But this formula looks familiar. Einstein shows that this entropy change is exactly the
same as that found in the Joule expansion of a perfect gas according to elementary statistical
mechanics. Boltzmann’s relation $S = k \ln W$ can be used to work out the entropy difference
$S - S_0$ between the initial and final states, $S - S_0 = k \ln(W/W_0)$, where the $W$s are the
probabilities of these states. In the initial state, the system has volume $V_0$ and the particles
move randomly throughout this volume. The probability that a single particle occupies a
smaller volume $V$ is $V/V_0$ and hence the probability that all $N$ end up in this volume $V$ is
$(V/V_0)^N$. Therefore, the entropy difference for a gas of $N$ particles is

\[ S - S_0 = kN \ln(V/V_0). \tag{3.11} \]

Einstein notes that (3.10) and (3.11) are formally identical. He immediately concludes
that the radiation behaves thermodynamically as if it consisted of discrete particles, their
number $N$ being equal to $\varepsilon/k\beta\nu$. In Einstein’s own words,


‘Monochromatic radiation of low density (within the limits of validity of Wien’s radiation
formula) behaves thermodynamically as though it consisted of a number of independent
energy quanta of magnitude $k\beta\nu$.’


Rewriting this result in Planck’s notation, since $\beta = h/k$, the energy of each quantum
is $h\nu$.
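
The formal identity of (3.10) and (3.11) can be confirmed numerically. The sketch below evaluates the Wien entropy (3.9) at two volumes for fixed energy and compares the difference with the perfect gas result for $N = \varepsilon/k\beta\nu$ particles. The frequency, bandwidth, energy and volumes are arbitrary illustrative values; setting Wien’s constant $\alpha$ equal to $h$ is an assumption made only to give it a definite value, since it cancels in the entropy difference.

```python
import math

H = 6.62607015e-34   # Planck's constant, J s
K_B = 1.380649e-23   # Boltzmann's constant, J/K
C = 2.99792458e8     # speed of light, m/s

BETA = H / K_B       # Wien's beta in Planck's notation
ALPHA = H            # assumed value of Wien's alpha; it cancels below

def wien_entropy(energy, volume, nu, dnu):
    """Entropy (3.9) of Wien radiation of total energy `energy` in `volume`."""
    arg = C**3 * energy / (8.0 * math.pi * ALPHA * nu**3 * volume * dnu)
    return -(energy / (BETA * nu)) * (math.log(arg) - 1.0)

nu, dnu = 1.0e15, 1.0e10   # frequency and bandwidth, Hz (illustrative)
energy = 1.0e-9            # total energy in the band, J (illustrative)
v0, v = 1.0e-3, 0.5e-3     # initial and final volumes, m^3 (illustrative)

delta_s_radiation = wien_entropy(energy, v, nu, dnu) - wien_entropy(energy, v0, nu, dnu)
n_quanta = energy / (K_B * BETA * nu)               # N = energy / (h nu)
delta_s_gas = K_B * n_quanta * math.log(v / v0)     # equation (3.11)

print(f"radiation: {delta_s_radiation:.6e} J/K")
print(f"gas:       {delta_s_gas:.6e} J/K")
```

The two entropy changes agree to rounding error, whatever values are chosen for the energy, bandwidth and volumes.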
Einstein then works out the average energy of the quanta according to Wien’s formula for
black-body radiation. The energy in the frequency interval $\nu$ to $\nu + d\nu$ is $\varepsilon$ and the number
of quanta is $\varepsilon/k\beta\nu$. Therefore, the average energy is

\[ \bar{E} = \frac{\int_0^\infty (8\pi\alpha/c^3)\,\nu^3 e^{-\beta\nu/T}\,d\nu}{\int_0^\infty (8\pi\alpha/c^3)(\nu^3/k\beta\nu)\,e^{-\beta\nu/T}\,d\nu} = k\beta\,\frac{\int_0^\infty \nu^3 e^{-\beta\nu/T}\,d\nu}{\int_0^\infty \nu^2 e^{-\beta\nu/T}\,d\nu} = k\beta \times \frac{3T}{\beta} = 3kT. \tag{3.12} \]

The average energy of the quanta is closely related to the mean kinetic energy per particle
in the black-body enclosure, $\tfrac{3}{2}kT$.

So far, Einstein has stated that the radiation ‘behaved as though’ it consisted of a number
of independent particles. Is this just another ‘formal device’? The last sentence of Sect. 6
of his paper leaves the reader in no doubt:

‘the next obvious step is to investigate whether the laws of emission and transformation 
of light are also of such a nature that they can be interpreted or explained by considering 
light to consist of such energy quanta.’ 


Einstein considers three phenomena which cannot be explained by classical electromagnetic
theory.


1. Stokes’ rule is the observation that the frequency of photoluminescent emission is less
than the frequency of the incident light. This is explained as a consequence of the
conservation of energy. If the incoming quanta each have energy $h\nu_1$, the re-emitted
quanta can at most have this energy. If some of the energy of the quanta is absorbed by
the material before re-emission, the emitted quanta of energy $h\nu$ must have $h\nu < h\nu_1$.

2. The photoelectric effect. This is the most famous result of the paper because Einstein 
made a definite quantitative prediction on the basis of the theory expounded above. 
Ironically, the photoelectric effect had been discovered by Hertz in 1887 in the same 
experiments which fully validated Maxwell’s equations. Perhaps the most remarkable 
feature of the effect was Lénárd’s discovery that the energies of the electrons emitted 
from the metal surface are independent of the intensity of the incident radiation (Lénárd, 
1902). 

Einstein’s proposal provided an immediate solution to this problem. Radiation of a
given frequency consists of quanta of the same energy $h\nu$. If one of these is absorbed
by the material, the electron may receive sufficient energy to remove it from the surface
against the forces which bind it to the material. If the intensity of the light is increased,
more electrons are ejected, but their energies remain unchanged. Einstein wrote this
result in the following form. The maximum kinetic energy which the ejected electron
can have, $E_k$, is

\[ E_k = h\nu - W, \tag{3.13} \]

where $W$ is the amount of work necessary to remove the electron from the surface of
the material, the work function of the material. Experiments to estimate the magnitude
of the work function involve placing the photocathode in an opposing potential so that,
when the potential reaches some value $V$, the ejected electrons can no longer reach the
anode and the photoelectric current falls to zero. This occurs at the potential at which
$E_k = eV$. Therefore,

\[ V = \frac{h}{e}\,\nu - \frac{W}{e}. \tag{3.14} \]

In Einstein’s words, 


‘If the formula derived is correct, then V must be a straight line function of the 
frequency of the incident light, when plotted in Cartesian coordinates, whose slope is 
independent of the nature of the substance investigated.’ 


Thus, the quantity h/e, the ratio of Planck’s constant to the electronic charge, can be 
found directly from the slope of this relation. Nothing was known about the dependence 
of the photoelectric effect upon the frequency of the incident radiation at that time. It 
was only in 1916 that Millikan’s meticulous experiments confirmed precisely Einstein’s 
prediction. 

3. Photoionisation of gases. The third piece of experimental evidence was the fact that 
the energy of each photon has to be greater than the ionisation potential of the gas if 
photoionisation is to take place. Einstein showed that the smallest energy quanta for 
the ionisation of air were approximately equal to the ionisation potential determined 
independently by Stark. Once again, the quantum hypothesis is in agreement with 
experiment. 
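
The content of Einstein’s prediction for the photoelectric effect, that the slope of the relation (3.14) is $h/e$ whatever the material, can be illustrated with a short Python sketch. The two work functions (2.3 V and 4.5 V, expressed as $W/e$) and the frequency grid are hypothetical values chosen for illustration; they are not data from Millikan’s experiments.

```python
H = 6.62607015e-34          # Planck's constant, J s
E_CHARGE = 1.602176634e-19  # electronic charge, C

def stopping_potential(nu, work_function_volts):
    """Stopping potential V from (3.14): V = (h/e) nu - W/e."""
    return (H / E_CHARGE) * nu - work_function_volts

def least_squares_slope(xs, ys):
    """Slope of the best-fitting straight line through the points (xs, ys)."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den

# Hypothetical materials with W/e = 2.3 V and 4.5 V; all frequencies lie
# above both photoelectric thresholds.
frequencies = [1.2e15 + 1.0e14 * i for i in range(8)]
slopes = {}
for w in (2.3, 4.5):
    potentials = [stopping_potential(nu, w) for nu in frequencies]
    slopes[w] = least_squares_slope(frequencies, potentials)
    print(f"W/e = {w} V: slope = {slopes[w]:.4e} V s, h/e = {H / E_CHARGE:.4e} V s")
```

Changing the work function shifts the intercept of the straight line but leaves the fitted slope equal to $h/e$.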


This is the work described in Einstein’s Nobel Prize citation of 1921. 


3.4 The quantum theory of solids 


In 1905, Einstein was not at all clear that Planck and he were actually describing the same 
phenomenon. In 1906, however, he showed that the two approaches were, in fact, the same 
(Einstein, 1906b). Then, later in the same year, he came to the same conclusion by a different 
argument, and went on to extend the idea of quantisation to solids (Einstein, 1906c). 

In the first of these papers, Einstein asserts that he and Planck are actually describing the 
same phenomena of quantisation. 


‘At that time [1905] it seemed to me as though Planck’s theory of radiation formed a 
contrast to my work in a certain respect. New considerations which are given in the 
first section of this paper demonstrate to me, however, that the theoretical foundation on 
which Planck’s radiation theory rests differs from the foundation that would result from 
Maxwell’s theory and the electron theory and indeed differs exactly in that Planck’s theory 
implicitly makes use of the hypothesis of light quanta just mentioned.’ (Einstein, 1906b) 


These arguments are developed further in the second paper of 1906. Einstein demonstrated
that, if Planck had followed Boltzmann’s procedures, he would have ended up with
the Boltzmann expression for the probability that a state of energy $E = r\varepsilon$ is occupied,
even though the limit $\varepsilon \to 0$ is not taken:

\[ p(E) \propto e^{-E/kT}. \]

It is assumed that the energy of the oscillator is quantised in units of $\varepsilon$. Thus, if there are
$N_0$ oscillators in the ground state, the number in the $r = 1$ state is $N_0 e^{-\varepsilon/kT}$, in the $r = 2$
state $N_0 e^{-2\varepsilon/kT}$, and so on. Therefore, the average energy of the oscillator is

\[ \bar{E} = \frac{N_0 \times 0 + \varepsilon N_0 e^{-\varepsilon/kT} + 2\varepsilon N_0 e^{-2\varepsilon/kT} + \cdots}{N_0 + N_0 e^{-\varepsilon/kT} + N_0 e^{-2\varepsilon/kT} + \cdots} = \frac{N_0\,\varepsilon\,e^{-\varepsilon/kT}\left[1 + 2e^{-\varepsilon/kT} + 3(e^{-\varepsilon/kT})^2 + \cdots\right]}{N_0\left[1 + e^{-\varepsilon/kT} + (e^{-\varepsilon/kT})^2 + \cdots\right]}. \tag{3.15} \]

We recall the following series:

\[ \frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots, \qquad \frac{1}{(1-x)^2} = 1 + 2x + 3x^2 + \cdots, \tag{3.16} \]

and hence the mean energy of the oscillator is

\[ \bar{E} = \frac{\varepsilon\,e^{-\varepsilon/kT}}{1 - e^{-\varepsilon/kT}} = \frac{\varepsilon}{e^{\varepsilon/kT} - 1}. \tag{3.17} \]


Thus, using the proper Boltzmann procedure, Planck’s relation for the mean energy of the
oscillator is recovered, provided the energy element $\varepsilon$ does not vanish. Einstein’s approach
indicates clearly the origin of the departure from the classical result. The mean energy
$\bar{E} = kT$ is recovered in the classical limit $\varepsilon \to 0$ from (3.17). Notice that, by allowing
$\varepsilon \to 0$, the averaging takes place over a continuum of energies which the oscillator might
take. In that limit, equal volumes of phase space are given equal weights in the averaging
process, and this is the origin of the classical equipartition theorem. Einstein shows that
Planck’s formula requires this assumption to be wrong. Rather, only those volumes of phase
space with energies $0, \varepsilon, 2\varepsilon, 3\varepsilon, \ldots$ should have non-zero weights and these should all be
equal.
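
The passage from the discrete sum (3.15) to the closed form (3.17), and the recovery of the classical value $kT$ as $\varepsilon \to 0$, can be checked directly. The following Python sketch works in units in which $kT = 1$; the sampled values of $\varepsilon$ and the truncation of the sum at 20000 levels are illustrative choices.

```python
import math

def mean_energy_sum(eps, kT, r_max=20000):
    """Average energy from the Boltzmann-weighted sum over levels r * eps, as in (3.15)."""
    weights = [math.exp(-r * eps / kT) for r in range(r_max)]
    total = sum(r * eps * w for r, w in enumerate(weights))
    return total / sum(weights)

def mean_energy_closed(eps, kT):
    """Closed form (3.17): eps / (exp(eps / kT) - 1)."""
    return eps / math.expm1(eps / kT)

kT = 1.0
results = {}
for eps in (5.0, 1.0, 0.1, 0.01):
    results[eps] = (mean_energy_sum(eps, kT), mean_energy_closed(eps, kT))
    print(f"eps/kT = {eps:5.2f}: sum = {results[eps][0]:.6f}, closed form = {results[eps][1]:.6f}")
```

For $\varepsilon \ll kT$ the mean energy approaches $kT$, while for $\varepsilon \gg kT$ it is exponentially suppressed, which is the origin of the departure from equipartition.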
Einstein then relates this result directly to his previous paper on light quanta: 


‘we must assume that for ions which can vibrate at a definite frequency and which make
possible the exchange of energy between radiation and matter, the manifold of possible
states must be narrower than it is for the bodies in our direct experience. We must in fact
assume that the mechanism of energy transfer is such that the energy can assume only the
values $0, \varepsilon, 2\varepsilon, 3\varepsilon, \ldots$’ (Einstein, 1906c)


But this is only the beginning of the paper. Much more is to follow — Einstein puts it 
beautifully. 


‘I now believe that we should not be satisfied with the result. For the following question 
forces itself upon us. If the elementary oscillators that are used in the theory of the energy 
exchange between radiation and matter cannot be interpreted in the sense of the present 
kinetic molecular theory, must we not also modify the theory for the other oscillators 
that are used in the molecular theory of heat? There is no doubt about the answer, in my 
opinion. If Planck’s theory of radiation strikes to the heart of the matter, then we must also 
expect to find contradictions between the present kinetic molecular theory and experiment 
in other areas of the theory of heat, contradictions that can be resolved by the route just 
traced. In my opinion, this is actually the case, as I try to show in what follows.’ (Einstein, 
1906c)


The problem discussed by Einstein concerns the heat capacities of solids. According to the
Dulong and Petit law, the heat capacity per mole of a solid is $3R$. This result can be derived
simply from the equipartition theorem. The model of the solid consists of $N_A$ atoms per
mole and it is supposed that they can all vibrate in the three independent directions, x, y, z.
According to the equipartition theorem, the internal energy per mole of the solid should
therefore be $3N_A kT$, since each independent mode of vibration is awarded an energy $kT$.
The heat capacity per mole follows directly by differentiation: $C = dU/dT = 3N_A k = 3R$.

It was known that some materials do not obey the Dulong and Petit law in that they have
significantly smaller heat capacities than $3R$. This was particularly true for light elements
such as carbon, boron and silicon. In addition, by 1900, it was known that the heat capacities
of some elements change rapidly with temperature and only attain the value $3R$ at high
temperatures.

The problem is readily solved if Einstein’s quantum hypothesis is adopted. For oscillators,
the classical formula for the average energy of the oscillator $kT$ should be replaced by the
quantum formula

\[ \bar{E} = \frac{h\nu}{e^{h\nu/kT} - 1}. \]

Now atoms are complicated systems, but let us suppose for simplicity that, for a particular
material, they all vibrate at the same frequency, the Einstein frequency $\nu_E$, and that these
vibrations are independent. Since each atom has three independent modes of vibration, the
internal energy is

\[ U = 3N_A \frac{h\nu_E}{e^{h\nu_E/kT} - 1}, \tag{3.18} \]

and the heat capacity is

\[ \frac{dU}{dT} = 3N_A h\nu_E \left(e^{h\nu_E/kT} - 1\right)^{-2} e^{h\nu_E/kT}\,\frac{h\nu_E}{kT^2} = 3R \left(\frac{h\nu_E}{kT}\right)^2 \frac{e^{h\nu_E/kT}}{\left(e^{h\nu_E/kT} - 1\right)^2}. \tag{3.19} \]


Einstein compared the experimentally determined variation of the heat capacity of diamond
with his formula, with the results shown in Fig. 3.2. The decrease in the heat capacity at low
temperatures is apparent, although the experimental points lie slightly above the predicted
relation at low temperatures.

We can now understand why light elements have smaller heat capacities than the heavier
elements. Presumably, the lighter elements have higher vibrational frequencies than heavier
elements and hence, at a given temperature, $\nu_E/T$ is larger and the heat capacity is smaller.
To account for the experimental data shown in Fig. 3.2, the frequency $\nu_E$ must lie in the
infrared waveband. As a result, all vibrations at higher frequencies make only a vanishingly
small contribution to the heat capacity. As expected, there is strong absorption at infrared
wavelengths corresponding to frequencies $\nu \approx \nu_E$. Einstein compared his estimates of $\nu_E$
with the strong absorption features observed in a number of materials and found remarkable
agreement, granted the simplicity of the model.
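
The behaviour of Einstein’s expression (3.19) is easily explored numerically. In the Python sketch below the formula is written in terms of $\theta_E = h\nu_E/k$, as in the caption of Fig. 3.2. The value $\theta_E = 1320$ K is an assumed illustrative figure of the order of that needed for diamond, not a number taken from the text, and the function name is hypothetical.

```python
import math

R = 8.314462618  # molar gas constant, J / (mol K)

def einstein_heat_capacity(T, theta_E):
    """Molar heat capacity (3.19) with x = h nu_E / kT = theta_E / T.

    Written with exp(-x) rather than exp(x) so that it does not overflow
    at low temperatures.
    """
    x = theta_E / T
    return 3.0 * R * x**2 * math.exp(-x) / math.expm1(-x) ** 2

THETA_E = 1320.0  # assumed illustrative Einstein temperature, K
capacities = {}
for T in (50.0, 300.0, 1000.0, 1.0e5):
    capacities[T] = einstein_heat_capacity(T, THETA_E)
    print(f"T = {T:8.0f} K: C / 3R = {capacities[T] / (3.0 * R):.4g}")
```

At room temperature the heat capacity is only about a quarter of the Dulong and Petit value for this choice of $\theta_E$, it approaches $3R$ when $T \gg \theta_E$, and it falls exponentially to zero when $T \ll \theta_E$.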
Fig. 3.2. The variation of the heat capacity of diamond with temperature compared with the prediction of Einstein’s quantum
theory. The abscissa is $T/\theta_E$, where $k\theta_E = h\nu_E$, and the ordinate the molar heat capacity in calories mole$^{-1}$. This
diagram appears in Einstein’s paper of 1906 and uses the results of Heinrich Weber which were listed in the tables of
Landolt and Börnstein (Einstein, 1906c).


The most important prediction of the theory was that the heat capacities of all solids 
should decrease to zero at low temperatures, as indicated in Fig. 3.2. This was of key
importance from the point of view of furthering the acceptance of Einstein’s ideas. At about 
this time, Walther Nernst began a series of experiments to measure the heat capacities of 
solids at low temperatures. His motivation was to test his heat theorem, or the third law of 
thermodynamics, which he had developed theoretically in order to understand the nature of 
chemical equilibria. The heat theorem enabled the calculation of chemical equilibria to be 
carried out precisely and also led to the prediction that the heat capacities of all materials 
should tend to zero at low temperatures. As recounted by Frank Blatt, 


‘. . . Shortly after Einstein had assumed a junior faculty position at the University of Zurich 
[in 1909], Nernst paid the young theorist a visit so that they could discuss problems of 
common interest. The chemist George Hevesy . . . recalls that among his colleagues it was 
this visit by Nernst that raised Einstein’s reputation. He had come as an unknown man to 
Zurich. Then, Nernst came, and the people at Zurich said, “This Einstein must be a clever 
fellow, if the great Nernst comes so far from Berlin to Zurich to talk to him.”’ (Blatt,
1992)


3.5 Debye’s theory of specific heats 


Einstein took little interest in the heat capacities of solids after 1907 but his ideas on
quantisation were taken significantly further by Pieter Debye in an important paper of 1912
(Debye, 1912). Einstein was well aware of the fact that the assumption that the atoms of a
solid vibrate independently was a crude approximation. Debye took the opposite approach
of returning to a continuum picture, almost identical to that developed by Rayleigh in his
treatment of the spectrum of black-body radiation (Sect. 2.3.4). Debye realised that the
collective modes of vibration of the solid could be represented by the complete set of
normal modes which follows from fitting waves into a box, as described by Rayleigh. Each
independent mode of oscillation of a solid as a whole should be awarded an energy

\[ \bar{E} = \frac{\hbar\omega}{\exp(\hbar\omega/kT) - 1}, \tag{3.20} \]

according to Einstein’s prescription, where $\omega$ is the angular frequency of vibration of the
mode. The number of modes $d\mathcal{N}$ in the angular frequency interval $\omega$ to $\omega + d\omega$ had been
evaluated by Rayleigh according to the procedure described in Sect. 2.3.4 and was shown to be

\[ d\mathcal{N} = \frac{L^3}{2\pi^2 c_s^3}\,\omega^2\,d\omega, \tag{3.21} \]


where $c_s$ is the speed of propagation of the waves in the material. Just as in the case of
electromagnetic radiation, we need to determine the number of independent polarisation
states for the wave modes. In this case, there are two transverse modes and one longitudinal
mode, corresponding to the independent directions in which the material can be stressed by
the wave, and so in total there are $3\,d\mathcal{N}$ modes, each of which is awarded the energy (3.20).
Debye makes the assumption that these modes have the same speed of propagation and that
it is independent of the frequency of the modes. Therefore, the total internal energy of the
material is

\[ U = \int_0^{\omega_{\max}} \frac{\hbar\omega}{\exp(\hbar\omega/kT) - 1}\,3\,d\mathcal{N} = \frac{3}{2\pi^2}\left(\frac{kTL}{\hbar c_s}\right)^3 kT \int_0^{x_{\max}} \frac{x^3}{e^x - 1}\,dx, \tag{3.22} \]

where $x = \hbar\omega/kT$.


The problem is now to determine the value of $x_{\max}$. Debye introduced the idea that
there must be a limit to the total number of modes in which energy could be stored. In
the high temperature limit, he argued that the total energy should not exceed that given
by the classical equipartition theorem, namely $3NkT$. Since each mode of oscillation has
energy $kT$ in this limit, there should be a maximum of $3N$ modes in which energy is
stored. Therefore, recalling that there are $3\,d\mathcal{N}$ modes, Debye’s condition can be found by
integrating (3.21)

\[ 3N = 3\int_0^{\omega_{\max}} d\mathcal{N} = 3\int_0^{\omega_{\max}} \frac{L^3}{2\pi^2 c_s^3}\,\omega^2\,d\omega, \qquad \omega_{\max}^3 = \frac{6\pi^2 N}{L^3}\,c_s^3. \tag{3.23} \]

It is conventional to write $x_{\max} = \hbar\omega_{\max}/kT = \theta_D/T$, where $\theta_D$ is known as the Debye
temperature. Therefore, the expression for the total internal energy of the material (3.22)
can be rewritten

\[ U = 9RT\left(\frac{T}{\theta_D}\right)^3 \int_0^{\theta_D/T} \frac{x^3}{e^x - 1}\,dx, \tag{3.24} \]

for one mole of the material. This is the famous expression derived by Debye for the internal
energy per mole of the solid.

To find the heat capacity, it is simplest to consider an infinitesimal increment $dU$ associated
with a single frequency $\omega$ and then integrate over $x$ as before,

\[ C = \frac{dU}{dT} = 9R\left(\frac{T}{\theta_D}\right)^3 \int_0^{\theta_D/T} \frac{x^4 e^x}{(e^x - 1)^2}\,dx. \tag{3.25} \]

This integral cannot be written in closed form, but it provides a better fit to the data on
the heat capacity of solids than Einstein’s expression (3.19). This is particularly true for
the data at low temperatures. If $T \ll \theta_D$, the upper limit to the integral in (3.25) can be
set to infinity, and then the integral has the value $4\pi^4/15$. Therefore, at low temperatures
$T \ll \theta_D$, the heat capacity depends upon temperature as

\[ C = \frac{dU}{dT} = \frac{12\pi^4}{5}\,R\left(\frac{T}{\theta_D}\right)^3. \tag{3.26} \]

Rather than decreasing exponentially at low temperatures, the heat capacity varies as $T^3$.
There is a simple interpretation of $\omega_{\max}$ in terms of wave propagation in the solid. From
(3.23), the maximum frequency is

\[ \nu_{\max} = \left(\frac{3}{4\pi}\right)^{1/3}\left(\frac{N_A}{L^3}\right)^{1/3} c_s, \tag{3.27} \]

for one mole of the solid, where $N_A$ is Avogadro’s number. But $L/N_A^{1/3}$ is just the typical
interatomic spacing, $a$. Therefore, (3.27) states that $\nu_{\max} \approx c_s/a$, that is, the minimum
wavelength of the waves $\lambda_{\min} = c_s/\nu_{\max} \approx a$. This makes a great deal of sense physically.
On scales less than the interatomic spacing $a$, the concept of collective vibration of the
atoms of the material ceases to have any meaning.
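
Although the integral in (3.25) has no closed form, it is straightforward to evaluate numerically. The Python sketch below uses Simpson’s rule, which is a choice made here for illustration, and checks the two limits discussed above: the $T^3$ law (3.26) for $T \ll \theta_D$ and the Dulong and Petit value $3R$ for $T \gg \theta_D$. The Debye temperature of 400 K is a hypothetical value, since only the ratio $T/\theta_D$ matters.

```python
import math

R = 8.314462618  # molar gas constant, J / (mol K)

def integrand(x):
    """x^4 e^x / (e^x - 1)^2, written with exp(-x) to avoid overflow."""
    if x == 0.0:
        return 0.0
    return x**4 * math.exp(-x) / math.expm1(-x) ** 2

def debye_heat_capacity(T, theta_D, n=2000):
    """Molar heat capacity (3.25) by Simpson's rule with n (even) intervals."""
    upper = theta_D / T
    h = upper / n
    total = integrand(0.0) + integrand(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return 9.0 * R * (T / theta_D) ** 3 * total * h / 3.0

THETA_D = 400.0  # hypothetical Debye temperature, K

c_low = debye_heat_capacity(8.0, THETA_D)                     # T = theta_D / 50
c_t3 = (12.0 * math.pi**4 / 5.0) * R * (8.0 / THETA_D) ** 3   # equation (3.26)
c_high = debye_heat_capacity(8000.0, THETA_D)                 # T = 20 theta_D

print(f"low T:  numerical = {c_low:.6e}, T^3 law = {c_t3:.6e}")
print(f"high T: C / 3R = {c_high / (3.0 * R):.5f}")
```

Intermediate temperatures can be explored by calling `debye_heat_capacity` with other values of `T`; the result rises smoothly from the $T^3$ regime towards $3R$.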


3.6 Fluctuations of particles and waves - Einstein (1909) 


Einstein’s startling new ideas on light quanta did not gain immediate acceptance by the 
scientific community at large. Most of the major figures in physics rejected the idea that 
light could be considered to be made up of discrete quanta. In a letter to Einstein of 1907, 
Planck wrote: 


‘I look for the significance of the elementary quantum of action (light quantum) not in 
vacuo but rather at points of absorption and emission and assume that processes in vacuo 
are accurately described by Maxwell’s equations. At least, I do not yet find a compelling 
reason for giving up this assumption which for the time being seems to be the simplest.’ 
(Planck, 1907)


Planck continued to reject the light quantum hypothesis as late as 1913. In 1909, Lorentz, 
who was generally regarded as the leading theoretical physicist in Europe, and whom 
Einstein held in the highest esteem, wrote: 


‘While I no longer doubt that the correct radiation formula can only be reached by
way of Planck’s hypothesis of energy elements, I consider it highly unlikely that these 
energy elements should be considered as light quanta which maintain their identity during 
propagation.’ (Lorentz, 1909) 


Einstein never deviated from his conviction concerning the reality of quanta and continued
to find other ways in which the experimental features of black-body radiation lead
inevitably to the conclusion that light consists of quanta. In one of his most impressive papers,
written in 1909, he showed how fluctuations in the intensity of the black-body radiation
spectrum provide further evidence for the quantum nature of light (Einstein, 1909). Notice
how the theme of fluctuations and stochastic processes keeps reappearing in Einstein’s 
physical understanding. I have given a detailed treatment of the theory of fluctuations in 
the number densities of particles and waves in Sects. 15.2.1 and 15.2.2 of TCP2. Here the 
results of these calculations are summarised. 


3.6.1 Particles in a box 


A box is divided into $N$ equal cells and a large number of particles is distributed randomly
among them. If the number of particles is very large, the mean number of particles in each cell is roughly the
same, but there is a real scatter about the mean value because of statistical fluctuations.
Suppose $p$ is the probability of a single cell being occupied and $q$ the probability that it
is not occupied so that $p + q = 1$. The probability distribution can be worked out exactly
using permutation theory and then converted to a continuous distribution. The result is a
normal, or Gaussian, distribution $p(x)\,dx$, which can be written

\[ p(x)\,dx = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left(-\frac{x^2}{2\sigma^2}\right) dx, \tag{3.28} \]

where $\sigma^2 = npq$ is the variance; $x$ is measured with respect to the mean value $np$. If the box
is divided into $N$ cells, the probability of a particle being in a single cell in one experiment
is $p = 1/N$, $q = (1 - 1/N)$. The total number of particles is $n$. Therefore, the average
number of particles per sub-box is $n/N$ and the variance about this mean value, that is, the
mean squared statistical fluctuation about the mean, is

\[ \sigma^2 = \frac{n}{N}\left(1 - \frac{1}{N}\right). \tag{3.29} \]

If $N$ is large, $\sigma^2 = n/N$, which is the average number of particles in each cell, that is,
$\sigma = (n/N)^{1/2}$. This is the well-known result that, for large values of $N$, the mean is equal
to the variance. This is the origin of the useful rule that the fractional fluctuation about the
average value is $1/M^{1/2}$ where $M$ is the number of discrete objects counted.
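
The equality of the mean and the variance for randomly distributed particles can be illustrated with a simple Monte Carlo experiment. In the Python sketch below, the numbers of particles and cells and the seed of the random number generator are arbitrary illustrative choices, and the function name is hypothetical.

```python
import random
from statistics import mean, pvariance

def cell_counts(n_particles, n_cells, rng):
    """Drop n_particles at random into n_cells and return the occupation numbers."""
    counts = [0] * n_cells
    for _ in range(n_particles):
        counts[rng.randrange(n_cells)] += 1
    return counts

rng = random.Random(1)            # fixed seed so that the run is repeatable
n_particles, n_cells = 200_000, 2_000
counts = cell_counts(n_particles, n_cells, rng)

avg = mean(counts)                # equals n / N
var = pvariance(counts)           # scatter of the cell occupations about the mean
print(f"mean = {avg:.2f}, variance = {var:.2f}, variance / mean = {var / avg:.3f}")
```

The ratio of the variance to the mean is close to unity, the residual scatter reflecting the finite number of cells used to estimate the variance.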
3.6.2 Fluctuations of randomly superposed waves


The random superposition of waves is different in important ways. Suppose the electric field 
E at some point in space is the random superposition of the electric fields from N sources, 
where N is very large. For simplicity, we consider only propagation in the z-direction and 
only one of the two linear polarisations of the waves, E or Ey. We also assume that the 
frequencies v and the amplitudes é of all the waves are the same, the only difference being 
their random phases. Then, the quantity E* Ey = | E|? is proportional to the Poynting vector 
flux density in the z-direction for the Ey component and so is proportional to the energy 
density of the radiation, where E* is the complex conjugate of Ex. Since the phases of the 
waves are random, 


(EEE) = NE? X ux. (3.30) 


This is a familiar result. For incoherent radiation, meaning for waves with random phases, 
the total energy density is equal to the sum of the energies in all the waves. 

A similar calculation can be carried out for the fluctuations in the average energy density
of the waves. We work out the quantity $\langle (E_x^* E_x)^2 \rangle$ with respect to the mean
value (3.30). Recalling that $\langle \Delta n^2 \rangle = \langle n^2 \rangle - \langle n \rangle^2$,

$$\Delta u_x^2 \propto \langle (E_x^* E_x)^2 \rangle - \langle E_x^* E_x \rangle^2. \tag{3.31}$$

As shown in Sect. 15.2.2 of TCP2,

$$\Delta u_x^2 = u_x^2, \tag{3.32}$$

that is, the fluctuations in the energy density are of the same magnitude as the energy density
of the radiation field itself. Despite the fact that the radiation measured by a detector is a
superposition of a large number of waves with random phases, the fluctuations in the fields
are as large as the magnitude of the total intensity. The physical meaning of this calculation
is clear. Every pair of waves of frequency $\nu$ interferes to produce fluctuations in intensity
of the radiation $\Delta u \sim u$. Notice that this analysis refers to waves of random phase $\phi$
and of a particular angular frequency $\omega$, that is, what we would refer to as waves
corresponding to a single mode.
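
The results (3.30) and (3.32) can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original analysis: it superposes equal-amplitude waves with uniformly random phases and compares the mean and variance of $|E|^2$. The function names, the number of waves and the number of trials are illustrative assumptions.

```python
import cmath
import math
import random


def intensity_samples(n_waves, n_trials, amplitude=1.0, seed=0):
    """Return |E|^2 for n_trials random superpositions of n_waves equal-amplitude waves."""
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    samples = []
    for _ in range(n_trials):
        total = 0j
        for _ in range(n_waves):
            # Each wave has the same amplitude and a phase drawn uniformly from [0, 2*pi).
            total += amplitude * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
        samples.append(abs(total) ** 2)
    return samples


def mean_and_variance(values):
    """Sample mean and (population) variance of a list of numbers."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


n_waves = 50  # illustrative choice of N
samples = intensity_samples(n_waves, n_trials=4000)
mean_u, var_u = mean_and_variance(samples)

print(f"<E*E> / (N xi^2)  = {mean_u / n_waves:.3f}")      # expected to be close to 1, as in (3.30)
print(f"variance / mean^2 = {var_u / mean_u ** 2:.3f}")   # expected to be close to 1, as in (3.32)
```

For a finite number of waves the expected ratio of variance to squared mean is $1 - 1/N$, which tends to the value of unity in (3.32) when $N$ is very large.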


3.6.3 Fluctuations in black-body radiation 
Einstein begins his paper of 1909 by reversing Boltzmann’s relation between entropy and 
probability: 
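$$W = e^{S/k}. \tag{3.33}$$

Consider the radiation in the frequency interval $\nu$ to $\nu + d\nu$. As before, we write
$\varepsilon = V u(\nu)\,d\nu$. Now divide the volume into a large number of cells and suppose that
$\Delta\varepsilon_i$ is the fluctuation in the $i$th cell. Then the entropy of this cell is

$$S_i = S_i(0) + \left(\frac{\partial S}{\partial U}\right)\Delta\varepsilon_i + \frac{1}{2}\left(\frac{\partial^2 S}{\partial U^2}\right)(\Delta\varepsilon_i)^2 + \cdots. \tag{3.34}$$

But, averaging over all cells, we know that there is no net fluctuation,
$\sum_i \Delta\varepsilon_i = 0$, and therefore

$$S = \sum_i S_i = S(0) + \frac{1}{2}\left(\frac{\partial^2 S}{\partial U^2}\right)\sum_i (\Delta\varepsilon_i)^2. \tag{3.35}$$

Therefore, using (3.33), the probability distribution of the fluctuations is

$$W \propto \exp\left[\frac{1}{2}\left(\frac{\partial^2 S}{\partial U^2}\right)\frac{\sum_i (\Delta\varepsilon_i)^2}{k}\right]. \tag{3.36}$$

This is the product of a set of normal distributions which, for any individual cell, can be written

$$W_i \propto \exp\left[-\frac{1}{2}\frac{(\Delta\varepsilon_i)^2}{\sigma^2}\right] \quad\text{where}\quad \sigma^2 = \frac{k}{\left(-\partial^2 S/\partial U^2\right)}. \tag{3.37}$$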


Notice that we have obtained a physical interpretation for the second derivative of the 
entropy with respect to energy, which had played a prominent part in Planck’s original 
analysis (see equation (2.28)). 

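Let us now find $\sigma^2$ for a black-body spectrum:

$$u(\nu) = \frac{8\pi h\nu^3}{c^3}\,\frac{1}{e^{h\nu/kT} - 1}. \tag{3.38}$$

Inverting (3.38),

$$\frac{1}{T} = \frac{k}{h\nu}\ln\left(\frac{8\pi h\nu^3}{c^3 u} + 1\right). \tag{3.39}$$

We now express this result in terms of the total energy in the cavity in the frequency
interval $\nu$ to $\nu + d\nu$, $\varepsilon = V u\,d\nu$. As before, $dS/dU = 1/T$ and we may identify
$\varepsilon$ with $U$. Therefore,

$$\begin{aligned}
\frac{\partial S}{\partial\varepsilon} &= \frac{k}{h\nu}\ln\left(\frac{8\pi h\nu^3}{c^3 u} + 1\right) = \frac{k}{h\nu}\ln\left(\frac{8\pi h\nu^3 V\,d\nu}{c^3\varepsilon} + 1\right), \\
\frac{\partial^2 S}{\partial\varepsilon^2} &= -\frac{k}{h\nu}\,\frac{1}{\left(\dfrac{8\pi h\nu^3 V\,d\nu}{c^3\varepsilon} + 1\right)}\,\frac{8\pi h\nu^3 V\,d\nu}{c^3\varepsilon^2} \\
&= -\frac{k}{h\nu\varepsilon + \dfrac{c^3}{8\pi\nu^2 V\,d\nu}\,\varepsilon^2} = -\frac{k}{\sigma^2}.
\end{aligned} \tag{3.40}$$

In terms of fractional fluctuations,

$$\frac{\sigma^2}{\varepsilon^2} = \frac{h\nu}{\varepsilon} + \frac{c^3}{8\pi\nu^2 V\,d\nu}. \tag{3.41}$$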


Einstein noted that the two terms on the right-hand side have quite specific meanings.
The first term originates from the Wien part of the spectrum and, if we suppose the radiation
consists of photons, each of energy $h\nu$, then $\varepsilon = Nh\nu$ and $h\nu/\varepsilon = 1/N$.
The first term therefore corresponds to the statement that the fractional fluctuation in the
intensity is just $1/N^{1/2}$ where $N$ is the number of photons, that is,

$$\Delta N/N = 1/N^{1/2}. \tag{3.42}$$
According to the considerations of Sect. 3.6.1, this is exactly the result expected if light
consists of discrete particles.

Let us now look more closely at the second term. It originates from the Rayleigh–Jeans
part of the spectrum. We ask, ‘How many independent modes are there in the box
in the frequency range $\nu$ to $\nu + d\nu$?’ We have already shown in Sect. 2.3.4 that there
are $8\pi\nu^2 V\,d\nu/c^3$ modes (see (2.24) et seq.). We have also shown in Sect. 3.6.2 that the
fluctuations associated with each wave mode have magnitude $\Delta\varepsilon^2 = \varepsilon^2$. When we add
together randomly all the independent modes in the frequency interval $\nu$ to $\nu + d\nu$, we add
their variances and hence

$$\frac{\langle \delta E^2 \rangle}{E^2} = \frac{1}{N_{\rm mode}} = \frac{c^3}{8\pi\nu^2 V\,d\nu},$$

which is exactly the same as the second term on the right-hand side of (3.41).

Thus, the two parts of the fluctuation spectrum correspond to particle and wave statistics,
the former corresponding to the Wien part of the spectrum and the latter to the
Rayleigh–Jeans part. The amazing aspect of this formula for the fluctuations is that, since we
add together the variances due to independent causes, the equation

$$\frac{\sigma^2}{\varepsilon^2} = \frac{h\nu}{\varepsilon} + \frac{c^3}{8\pi\nu^2 V\,d\nu} \tag{3.43}$$

states that we should add independently the variances of the ‘wave’ and ‘particle’ fluctuations
of the radiation field to find the total magnitude of the fluctuations. This remarkable
expression was to have long-term resonances in the struggles to interpret quantum mechanics
once the theory reached its definitive form in the late 1920s.
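
To see how the two terms of (3.41) trade places across the spectrum, the following sketch evaluates them for a Planck spectrum. It is an illustration under stated assumptions rather than part of Einstein's argument: the temperature, cavity volume and bandwidth are arbitrary illustrative values, and the function names are hypothetical. Because $\varepsilon = V u(\nu)\,d\nu$, the ratio of the ‘particle’ term to the ‘wave’ term reduces to $e^{h\nu/kT} - 1$, independent of $V$ and $d\nu$.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23   # Boltzmann constant, J / K
C = 2.99792458e8     # speed of light, m / s


def planck_energy_density(nu, temperature):
    """Black-body spectral energy density u(nu) of (3.38), in J m^-3 Hz^-1."""
    x = H * nu / (K_B * temperature)
    return (8.0 * math.pi * H * nu ** 3 / C ** 3) / math.expm1(x)


def fractional_fluctuation_terms(nu, temperature, volume, d_nu):
    """Return the 'particle' and 'wave' terms of sigma^2 / epsilon^2 in (3.41)."""
    energy = volume * planck_energy_density(nu, temperature) * d_nu  # epsilon = V u dnu
    particle_term = H * nu / energy
    wave_term = C ** 3 / (8.0 * math.pi * nu ** 2 * volume * d_nu)
    return particle_term, wave_term


temperature = 300.0  # K, illustrative assumption
volume = 1.0e-6      # m^3, illustrative assumption

for x in (0.01, 1.0, 10.0):
    nu = x * K_B * temperature / H  # frequency at which h nu / kT = x
    particle, wave = fractional_fluctuation_terms(nu, temperature, volume, d_nu=0.01 * nu)
    print(f"h nu / kT = {x:5.2f}: particle term / wave term = {particle / wave:.3e}")
```

The wave term dominates when $h\nu \ll kT$ (the Rayleigh–Jeans region) and the particle term dominates when $h\nu \gg kT$ (the Wien region), in line with the interpretation given above.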


3.7 The First Solvay Conference 


Among those who were persuaded of the importance of quanta was Walther Nernst, who 
at that time was measuring the low temperature heat capacities of various materials. As 
recounted in Sect. 3.4, Nernst visited Einstein in Zurich in March 1910 and they compared 
Einstein’s theory with his recent experiments. These experiments showed that Einstein’s 
predictions of the low temperature variation of the specific heat with temperature (3.19) 
gave a good description of the experimental results. As Einstein wrote to his friend Jakob 
Laub after the visit, 


‘I consider the quantum theory certain. My predictions with respect to specific heats 
seem to be strikingly confirmed. Nernst, who has just been here, and Rubens are eagerly 
occupied with experimental tests, so that people will soon be informed about this matter.’ 
(Einstein, 1910) 


By 1911, Nernst was convinced, not only of the importance of Einstein’s results, but also of 
the theory underlying them. The outcome of the meeting with Einstein was dramatic. The 
number of papers on quanta began to increase rapidly, as Nernst popularised the results of 
the quantum theory of solids. 

Fig. 3.3 The participants in the First Solvay Conference on physics, Brussels 1911 (Langevin and
De Broglie, 1912). From left to right at the table: Nernst, Brillouin, Solvay, Lorentz, Warburg,
Perrin, Wien, Skłodowska-Curie, Poincaré. From left to right standing: Goldschmidt, Planck,
Rubens, Sommerfeld, Lindemann, de Broglie, Knudsen, Hasenöhrl, Hostelet, Herzen, Jeans,
Rutherford, Onnes, Einstein, Langevin.


Nernst was a friend of the wealthy Belgian industrialist Ernest Solvay and he persuaded 
him to sponsor a meeting of a select group of physicists to discuss the issues of quanta and 
radiation. The idea was first mooted in 1910, but Planck urged that the meeting should be 
postponed for a year. As he wrote, 


‘My experience leads me to the opinion that scarcely half of those you envisage as 
participants have a sufficiently lively conviction of the pressing need for reform to be 
motivated to attend the conference ... Of the entire list you name, I believe that besides 
ourselves [only] Einstein, Lorentz, W. Wien and Larmor are deeply interested in the topic.’ 
(Planck, 1910) 


By the following year, matters were very different. Debye, Haas, Hasenöhrl, Schidlof, Weiss 
and Wilson had published papers on the quantum hypothesis. 

The eighteen official participants met on 29 October 1911 in the Hotel Metropole in
Brussels and the meeting took place between the 30th of that month and 3 November
(Fig. 3.3). By this time, the majority of the participants took the quantum hypothesis
seriously. Two of them were against quanta: Jeans and Poincaré. Five were initially
neutral: Rutherford, Brillouin, Skłodowska-Curie, Perrin and Knudsen. The eleven others were
basically pro-quanta: Lorentz (chairman), Nernst, Planck, Rubens, Sommerfeld, Wien,
Warburg, Langevin, Einstein, Hasenöhrl and Onnes. The secretaries were Goldschmidt,
de Broglie and Lindemann; Solvay, who hosted the conference, was there, as well as his
collaborators Herzen and Hostelet. The physicists who took a neutral position had done so
because they were unfamiliar with the arguments.

Fig. 3.4 Millikan’s results on the photoelectric effect compared with the predictions of
Einstein’s quantum theory (Millikan, 1916).

The conference had a profound effect in that it provided a forum at which all the arguments 
could be presented. In addition, all the participants wrote their lectures for publication 
beforehand and these were then discussed in detail. These discussions were recorded and 
the full proceedings published within a year of the event in the important volume La Théorie
du rayonnement et les quanta: Rapports et discussions de la réunion tenue à Bruxelles, du
30 octobre au 3 novembre 1911 (Langevin and De Broglie, 1912). Thus, all the important
issues were made available to the scientific community in one volume. As a result, the next
generation of students became fully familiar with the arguments and many of them set to 
work immediately to tackle the problems of quanta. Furthermore, these problems began to 
be appreciated beyond the central European German-speaking scientific community. 

A particularly significant convert was Poincaré who immediately tackled the issue of
whether or not the introduction of what he called ‘discontinuities’ was essential in order
to understand the spectrum of black-body radiation. In his detailed analysis of the problem,
he came to the conclusion that if $w(\varepsilon)$ is the probability density of Planck’s resonators,
the measured spectrum in the Wien region could only be accounted for if the function was a
discontinuous function of the energy $\varepsilon$ (Poincaré, 1912). It had to be zero for all values
of $\varepsilon$ except $\varepsilon = 0, h\nu, 2h\nu, 3h\nu, \ldots$ Poincaré’s paper was so compelling that even
Jeans conceded that he had to accept the quantum hypothesis in its entirety.

It would be wrong, however, to believe that everyone was suddenly convinced of the 
existence of quanta. In his paper of 1916 on his famous series of experiments in which 
he verified the dependence of the photoelectric effect upon frequency (Fig. 3.4), Millikan 
stated: 
‘We are confronted however by the astonishing situation that these facts were correctly
and exactly predicted nine years ago by a form of quantum theory which has now been
generally abandoned.’ (Millikan, 1916)


Millikan refers to Einstein’s ‘bold, not to say reckless, hypothesis of an electromagnetic
light corpuscle of energy $h\nu$ which flies in the face of the thoroughly established facts of
interference.’ (Millikan, 1916).


3.8 The end of the beginning 


The 1911 Solvay Conference is a convenient point at which to conclude the introduction 
to the history of quantum mechanics. The essential role of quanta and the fundamental 
significance of Planck’s constant h could not be ignored. At the same time, something had 
gone spectacularly wrong with classical physics, but it was far from clear what was to 
replace it. The remarkable analyses of Planck and Einstein had shown that oscillators and 
radiation are quantised, but how did it all fit together? 

The next phase from 1911 to 1924 I have designated the era of the old quantum theory 
during which many of the pieces began to fall into place and which were eventually to be 
subsumed into the full theory of quantum mechanics, but the old theory was doomed to 
failure. Nonetheless, these endeavours uncovered many of the essential features of quantum 
processes which were only to be rationalised within the context of quantum mechanics 
during the dramatic years from 1925 to 1930. 
The Bohr model of the hydrogen atom





Following the success of the 1911 Solvay Conference and the rapid dissemination of the 
proceedings, the emphasis of research shifted towards the understanding of the spectra 
of atoms and molecules. With the availability of precision spectroscopic techniques, the 
bewildering variety of spectral features of atoms and molecules became apparent. The 
efforts described in Sect. 1.6 indicate how regularities were found in the patterns of spectral 
lines, the culmination of these investigations being the discovery of the formula for the 
Balmer series and the various formulae to account for the principal, diffuse and sharp series 
of the lines in the spectra of sodium, potassium, magnesium, calcium and zinc. As Planck 
remarked in 1902, 


‘If the question concerning the nature of white light may thus be regarded as being solved, 
the answer to the closely related but no less important question — the question concerning 
the nature of light of the spectral lines — seems to belong among the most difficult and 
complicated problems, which have ever been posed in optics or electrodynamics.’ (Planck, 
1902) 


4.1 The Zeeman effect: Lorentz and Larmor’s interpretations 




In 1862, Faraday attempted to measure the change in wavelength of spectral lines when 
the source of the lines was placed in a strong magnetic field, but failed to observe any 
positive effect (Jones, 1870). Inspired by this negative result, Pieter Zeeman repeated the 
experiment and discovered the broadening of the D lines of sodium when a sodium flame 
was placed between the poles of a strong electromagnet (Zeeman, 1896a). Zeeman used 
a high quality Rowland grating with a radius of 10 feet and 14938 lines per inch but the 
10 kG produced by the magnet was insufficient to resolve the broadened lines (Fig. 4.1). By 
the end of October 1896, Zeeman was convinced that the broadening of the spectral lines 
was a real effect, the broadening being proportional to the applied magnetic flux density, 
and his paper was presented on Saturday 31 October 1896 to the Science Section of the 
Dutch Academy of Sciences. Over that same weekend, Lorentz interpreted this result in 
terms of the splitting of spectral lines due to the motion of the ‘ions’ in the atoms in the 
magnetic field. 

Fig. 4.1 The original electromagnet used by Zeeman in 1896 is in the Museum Boerhaave in Leiden.
The experiment was reconstructed in 2002, on the occasion of the centennial anniversary of the
Nobel Prize for Zeeman and Lorentz.

Lorentz had derived the correct expression for the force acting on a charge in combined
electric and magnetic fields in 1892, the Lorentz force,

$$\mathbf{F} = e(\mathbf{E} + \mathbf{v} \times \mathbf{B}), \tag{4.1}$$

where $\mathbf{E}$ is the electric field strength and $\mathbf{B}$ is the magnetic flux density
(Lorentz, 1892b). Lorentz carried out the following calculation. It is supposed that the emission
line is due to the vibration of oscillators within the atoms of the material, each characterised
by a mass $m$ and a spring constant $k$. In the presence of a uniform magnetic field in the
$z$-direction, the equations of motion of the oscillator under the influence of the Lorentz
force can be written,

$$m\frac{d^2x}{dt^2} = -kx + eB\frac{dy}{dt}, \tag{4.2}$$

$$m\frac{d^2y}{dt^2} = -ky - eB\frac{dx}{dt}, \tag{4.3}$$

$$m\frac{d^2z}{dt^2} = -kz. \tag{4.4}$$

The solution of (4.4) is $z = a\cos(\omega_0 t + p)$ where $a$ and $p$ are constants and
$\omega_0$ is the angular frequency of the oscillator, $\omega_0 = 2\pi\nu_0 = \sqrt{k/m}$.
Lorentz found the following two solutions for motion in the $x$- and $y$-directions:

$$x = a_1\cos(\omega_1 t + p_1), \qquad y = -a_1\sin(\omega_1 t + p_1), \tag{4.5}$$

and

$$x = a_2\cos(\omega_2 t + p_2), \qquad y = a_2\sin(\omega_2 t + p_2), \tag{4.6}$$

where

$$\omega_1^2 - \frac{eB}{m}\,\omega_1 = \omega_0^2 \quad\text{and}\quad \omega_2^2 + \frac{eB}{m}\,\omega_2 = \omega_0^2. \tag{4.7}$$

The angular frequencies $\omega_1$ and $\omega_2$ are just slightly displaced from $\omega_0$ and so,
for small departures $\Delta\omega$ from $\omega_0$, we find

$$\Delta\omega = \pm\frac{eB}{2m} \quad\text{or}\quad \Delta\nu = \pm\frac{eB}{4\pi m}. \tag{4.8}$$











This analysis made definite predictions about the polarisation of the broadened lines. First of
all, inspection of (4.5) and (4.6) shows that these motions correspond to circular motion in
opposite senses about the magnetic field direction. Therefore, as viewed along the magnetic
field direction, the radiation of the ‘ions’ should be circularly polarised in opposite senses on
either side of the unperturbed frequency $\nu_0$. Along this direction there should be no emission
at $\nu_0$ because the acceleration of the oscillation along the field is along the line of sight.
On the other hand, when viewed
perpendicular to the magnetic field direction, all three components should be observed. 
They are all linearly polarised, the central frequency component being polarised parallel 
to the field direction and the displaced components perpendicular to the field direction. 
Zeeman continued his careful measurements over the next two months and discovered that 
the polarisation properties of the broadened lines agreed with these expectations (Zeeman, 
1896b). Several months later, the splitting of the blue cadmium line into separate lines 
was observed (Zeeman, 1897). For observations along the magnetic field direction two 
components were observed, while triplets were observed for measurements perpendicular 
to it. 

Furthermore, according to (4.8), the splitting of the line depends upon the ratio e/m and 
so the broadening of the line enables this ratio to be found for the ‘ions’ responsible for the 
emission lines. Lorentz found a lower limit of 1000 for the value of e/m relative to that
of the hydrogen ion. This came as a surprise to Lorentz and his colleagues since the ions 
responsible for electrolytic phenomena had values of e/m similar to that of the hydrogen 
ion. The Zeeman effect thus provided a means of studying the internal structure of atoms. 
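
A short numerical sketch shows the size of the effect predicted by (4.7) and (4.8). It assumes present-day values of $e$ and $m$ for the electron, a field of 10 kG = 1 T as quoted above, and a wavelength of about 589 nm for the sodium D lines; the function name is hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
M_ELECTRON = 9.1093837e-31   # electron mass, kg
C = 2.99792458e8             # speed of light, m / s


def zeeman_frequencies(omega0, b_field, charge=E_CHARGE, mass=M_ELECTRON):
    """Positive roots of the two quadratics in (4.7), returned as (omega1, omega2)."""
    half = charge * b_field / (2.0 * mass)      # eB / 2m
    root = math.sqrt(omega0 ** 2 + half ** 2)
    return root + half, root - half


wavelength = 589.0e-9                           # sodium D lines, roughly 589 nm (assumed value)
omega0 = 2.0 * math.pi * C / wavelength
b_field = 1.0                                   # 10 kG expressed in tesla

omega1, omega2 = zeeman_frequencies(omega0, b_field)
delta_nu = E_CHARGE * b_field / (4.0 * math.pi * M_ELECTRON)   # approximate shift from (4.8)
delta_lambda = wavelength ** 2 * delta_nu / C                   # corresponding wavelength shift

print(f"exact shift (omega1 - omega0) / 2 pi = {(omega1 - omega0) / (2.0 * math.pi):.4e} Hz")
print(f"approximate shift eB / 4 pi m        = {delta_nu:.4e} Hz")
print(f"wavelength shift                     = {delta_lambda * 1e9:.4f} nm")
```

With these assumed inputs the shift is of order $10^{10}$ Hz, a tiny fraction of the optical frequency, which is why the approximation leading to (4.8) is so good and why a field of this strength produced only a broadening of the lines in Zeeman's first experiments.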

At about the same time, an alternative approach to the splitting of spectral lines was
proposed by Joseph Larmor (Larmor, 1897). In the simplest case, consider a charged
particle of mass $m$ and electric charge $e$ moving in a circular orbit. There is an electric
current associated with this motion resulting in a magnetic moment $\boldsymbol{\mu}$ which is
related to the angular momentum of the particle $\mathbf{L}$ by the classical relation
$\boldsymbol{\mu} = (e/2m)\mathbf{L}$. Suppose the axis of the magnetic dipole of the orbit is at an
angle $\theta$ with respect to the magnetic field direction $\mathbf{B}$. Then, there is a torque
acting on the orbiting electron, the magnitude and direction of the torque being given by the
vector relation $\boldsymbol{\Gamma} = \boldsymbol{\mu} \times \mathbf{B}$. The action of the
torque causes the angular momentum vector to precess about the direction of the magnetic
field in the azimuthal $\phi$ direction since

$$\frac{d\mathbf{L}}{dt} = \boldsymbol{\Gamma}. \tag{4.9}$$

From the geometry of the precession, $|d\mathbf{L}| = |\mathbf{L}|\sin\theta\,d\phi$ and so the
angular frequency of precession is

$$\omega_p = \frac{d\phi}{dt} = \frac{|d\mathbf{L}|}{dt}\,\frac{1}{|\mathbf{L}|\sin\theta} = \frac{|\boldsymbol{\Gamma}|}{|\mathbf{L}|\sin\theta} = \frac{|\boldsymbol{\mu}||\mathbf{B}|\sin\theta}{|\mathbf{L}|\sin\theta} = \frac{eB}{2m}. \tag{4.10}$$


This is exactly the same formula as derived by Lorentz. Thus, there were two ‘ion’ pictures 
involving the magnetic field — either the splitting was associated with the effect of the 
magnetic field upon a linear oscillator or with the precessional motion of an orbiting 
charge. 
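
The precession result can be checked by integrating the torque equation directly. The sketch below is an illustration, not Larmor's method: it integrates $d\mathbf{L}/dt = (e/2m)\,\mathbf{L} \times \mathbf{B}$ with a fourth-order Runge–Kutta step in scaled units in which $e/2m = 1$ and $B = 1$, so that the expected precession period is $2\pi$. All names and the tilt angle of 30° are assumptions made for illustration.

```python
import math


def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torque(l_vec, b_vec, gyro):
    """dL/dt = mu x B with mu = gyro * L, where gyro stands for e/2m."""
    return tuple(gyro * component for component in cross(l_vec, b_vec))


def rk4_step(l_vec, b_vec, gyro, dt):
    """Advance L by one fourth-order Runge-Kutta step of length dt."""
    def shifted(slope, factor):
        return tuple(x + factor * s for x, s in zip(l_vec, slope))

    k1 = torque(l_vec, b_vec, gyro)
    k2 = torque(shifted(k1, dt / 2.0), b_vec, gyro)
    k3 = torque(shifted(k2, dt / 2.0), b_vec, gyro)
    k4 = torque(shifted(k3, dt), b_vec, gyro)
    return tuple(x + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for x, a, b, c, d in zip(l_vec, k1, k2, k3, k4))


def precess(l_start, b_vec, gyro, total_time, n_steps):
    """Integrate the torque equation for total_time and return the final L."""
    l_vec = l_start
    for _ in range(n_steps):
        l_vec = rk4_step(l_vec, b_vec, gyro, total_time / n_steps)
    return l_vec


gyro = 1.0                          # e/2m in scaled units (assumption)
b_vec = (0.0, 0.0, 1.0)             # field along z with B = 1
theta = math.radians(30.0)          # tilt of L with respect to B (illustrative)
l_start = (math.sin(theta), 0.0, math.cos(theta))

period = 2.0 * math.pi / (gyro * 1.0)   # 2 pi / omega_p with omega_p = (e/2m) B
l_half = precess(l_start, b_vec, gyro, period / 2.0, 1000)
l_full = precess(l_start, b_vec, gyro, period, 2000)
print("L after half a period:", tuple(round(x, 6) for x in l_half))
print("L after a full period:", tuple(round(x, 6) for x in l_full))
```

The component of $\mathbf{L}$ along the field and the magnitude $|\mathbf{L}|$ stay fixed, while the transverse component rotates through a full turn in the time $2\pi/\omega_p$, as (4.10) requires.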

These discoveries gave significant insights into the physics of atoms, but the picture 
was soon clouded by the discoveries of Michelson (1897) and Preston (1898) that the 
spectral lines of atoms could be split into four, six or more components. These results were 
inconsistent with the Lorentz picture which became known as the normal Zeeman effect. 
The higher order splittings were referred to as the anomalous Zeeman effect, the explanation 
of which was at least 20 years in the future. 


4.2 The problems of building models of atoms 


By 1900, the existence of the electron was clearly established through the discovery of 
the Zeeman effect and the values of e/m determined from the discharge tube experiments 
of Thomson, Wiechert and Kaufmann described in Sect. 2.2.3. As Planck noted above, 
however, the problems of constructing models of atoms were formidable. Heilbron (1977) 
conveniently lists six basic questions which faced the model builders. 


• The nature of the positive charge necessary to create electrically neutral atoms. Was
charge neutrality provided by an equal number of positively charged electrons or by
some other distribution of positive charge?

• The number of electrons in the atom was uncertain. From their charge-to-mass ratio,
there could be thousands of electrons in the atom and that was perhaps not unreasonable
in view of the large numbers of spectral lines observed in atomic spectra. Even the
lightest elements have large numbers of spectral lines while several thousands of lines
are observed in the spectrum of iron.

• There is nothing in classical physics which could establish a natural length-scale for
atoms. A clue was at hand with the introduction of Planck’s constant $h$ in his epochal paper
of 1900, but it was over a decade before Bohr showed how the concept of quantisation
could be applied to determine the size of the hydrogen atom.

• The problem of the collapse of the atom due to the emission of electromagnetic radiation
by the orbiting electrons was a major stumbling block for atomic theorists. As we will
see in Sect. 4.3.2, there are ways of minimising the problem, but these were to prove
inadequate.

• Even if these problems could be overcome, there was still the problem of understanding
the origin of the various formulae for the spectral lines discussed in Sect. 1.6.

• Finally, there remained the fundamental problem of understanding the nature of the
oscillators responsible for the observation of spectral lines.


These issues were to be addressed by a combination of experiment and theory over the 
following decade, the pieces of the jigsaw gradually falling into place. Two of the problems 
were to find definitive solutions through the experiments of Thomson and Rutherford. 


4.3 Thomson and Rutherford 


4.3.1 Thomson and the numbers of electrons in atoms 


In 1906, Thomson published his analysis of three different ways of estimating the number of 
electrons in atoms (Thomson, 1906). His model of the atom involved a swarm of electrons 
within a neutralising sphere of positive charge. The first approach involved working out 
the dispersion of light of different frequencies when it passes through a gas. He applied 
the method successfully to hydrogen and found that the number of electrons had to be 
approximately equal to the atomic mass number A = 1. 

The second approach involved the scattering of X-rays by electrons. X-rays are scattered 
by the electrons in atoms by Thomson scattering, the theory of which was worked out 
by Thomson using the classical expression for the radiation of an accelerated electron 
(Thomson, 1907). It is straightforward to show that the cross-section for the scattering of a 
beam of incident radiation by an electron is 
$$\sigma_{\rm T} = \frac{e^4}{6\pi\epsilon_0^2 m_e^2 c^4} = \frac{8\pi r_e^2}{3} = 6.653 \times 10^{-29}\ \mathrm{m}^2, \tag{4.11}$$

the Thomson cross-section, where $r_e = e^2/4\pi\epsilon_0 m_e c^2$ is the classical electron radius.
In his model of the atom, it was assumed that the ‘corpuscles’ behave like free electrons; notice
that the Thomson cross-section is independent of the frequency of the incident radiation.
As Thomson states in his paper, 


‘Barkla has shown that in the case of gases the energy in the scattered radiation always 
bears, for the same gas, a constant ratio to the energy in the primary whatever be the nature 
of the rays, that is, whether they are hard or soft; and secondly, that the scattered energy 
is proportional to the mass of the gas. The first of these results is a confirmation of the 
theory, as the ratio of the energy scattered to that in the primary rays ... is independent
of the nature of the rays; the second result shows that the number of corpuscles per cubic 
centimetre is proportional to the mass of the gas: from this it follows that the number 
of corpuscles in an atom is proportional to the mass of the atom, that is, to the atomic 
weight.’ 


For a beam of X-rays passing through air, Barkla had measured the scattered fraction of
the X-ray intensity to be $2.4 \times 10^{-4}$ cm$^{-1}$ and Thomson consequently inferred that
there should be about 25 corpuscles per air molecule, roughly equal to the atomic mass number
$A$ of the air molecules.
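
The numerical value in (4.11) is easily reproduced. The sketch below assumes present-day values of the physical constants, which are not those available to Thomson, and checks that the two forms of the cross-section in (4.11) agree; the function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
M_ELECTRON = 9.1093837e-31    # electron mass, kg
C = 2.99792458e8              # speed of light, m / s
EPSILON_0 = 8.8541878e-12     # permittivity of free space, F / m


def classical_electron_radius():
    """r_e = e^2 / (4 pi epsilon_0 m_e c^2), in metres."""
    return E_CHARGE ** 2 / (4.0 * math.pi * EPSILON_0 * M_ELECTRON * C ** 2)


def thomson_cross_section():
    """sigma_T = 8 pi r_e^2 / 3, in square metres."""
    return 8.0 * math.pi * classical_electron_radius() ** 2 / 3.0


def thomson_cross_section_direct():
    """sigma_T = e^4 / (6 pi epsilon_0^2 m_e^2 c^4), the first form in (4.11)."""
    return E_CHARGE ** 4 / (6.0 * math.pi * EPSILON_0 ** 2 * M_ELECTRON ** 2 * C ** 4)


print(f"r_e     = {classical_electron_radius():.4e} m")
print(f"sigma_T = {thomson_cross_section():.4e} m^2")
print(f"sigma_T = {thomson_cross_section_direct():.4e} m^2 (direct form)")
```

Neither function takes the frequency of the radiation as an argument, which is the point made in the text: the cross-section is the same for hard and soft X-rays.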
The third approach involved the scattering of β-rays by matter. Thomson derived the
formula for what was called ‘multiple-scattering theory’, the energy loss due to multiple
electrostatic interactions between the fast electron and the electrons in atoms; this process
is also known as ionisation losses in the context of the interactions of electrons with atoms.
Using estimates by Rutherford for the mean free path of β-rays, Thomson again found the
result $n \sim A$. The conclusion of his paper was dramatic: most of the mass of atoms could
not be due to the negatively charged electrons but must reside in the positive charge which
held the atoms together. The X-ray experiments were continued by Barkla who showed
that in fact, except for hydrogen, the number of electrons is roughly half the atomic weight,
$n \approx A/2$ (Barkla, 1911a).


4.3.2 The radiative and mechanical instability of atoms 


The construction of atomic models was a major industry in the early years of the twentieth 
century, particularly in England, and has been splendidly surveyed by Heilbron (1977). A 
key question was, ‘How are the electrons and the positive charge distributed inside atoms?’ 
However they are distributed, they cannot be stationary because of Earnshaw’s theorem, 
which states that any static distribution of electric charges is mechanically unstable, in 
that they either collapse or disperse to infinity under the action of electrostatic forces. The 
alternative is to place the electrons in orbits, what is often called the ‘Saturnian’ model of 
the atom, as advocated by Perrin (1901) and Nagaoka (1904a,b). Nagaoka was inspired by 
Maxwell’s model for Saturn’s rings and attempted to associate the spectral lines of atoms 
with small vibrational perturbations of the electrons about their equilibrium orbits. 

A major problem with the Saturnian pictures was the radiative instability of the electron.
Suppose the electron has a circular orbit of radius $a$. Then, equating the centripetal force
to the electrostatic force of attraction between the electron and the nucleus of charge $Ze$,

$$\frac{Ze^2}{4\pi\epsilon_0 a^2} = \frac{m_e v^2}{a} = m_e|\ddot{\mathbf{r}}|, \tag{4.12}$$

where $|\ddot{\mathbf{r}}|$ is the centripetal acceleration. The rate at which the electron loses
energy by radiation is given by (2.1). The kinetic energy of the electron is
$E = \tfrac{1}{2}m_e v^2 = \tfrac{1}{2}m_e a|\ddot{\mathbf{r}}|$. Therefore, taking $Z = 1$, the time it
takes the electron to lose all its kinetic energy by radiation is

$$T = \frac{E}{|dE/dt|} = \frac{2\pi a^3}{\sigma_{\rm T} c}. \tag{4.13}$$

Taking the radius of the atom to be $a = 10^{-10}$ m, the time it takes the electron to lose
all its energy is about $3 \times 10^{-10}$ s. Something is profoundly wrong. As the electron loses
energy, it moves into an orbit of smaller radius, loses energy more rapidly and spirals into
the nucleus.
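
The estimate can be reproduced step by step. The sketch below assumes that the radiation loss rate referred to as (2.1) is the standard classical expression $|dE/dt| = e^2|\ddot{\mathbf{r}}|^2/6\pi\epsilon_0 c^3$; that equation is not reproduced in this section, so this is an assumption, as are the function names and the use of present-day constants.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
M_ELECTRON = 9.1093837e-31    # electron mass, kg
C = 2.99792458e8              # speed of light, m / s
EPSILON_0 = 8.8541878e-12     # permittivity of free space, F / m


def centripetal_acceleration(radius, z=1):
    """|r''| from (4.12): Z e^2 / (4 pi epsilon_0 m_e a^2)."""
    return z * E_CHARGE ** 2 / (4.0 * math.pi * EPSILON_0 * M_ELECTRON * radius ** 2)


def radiated_power(acceleration):
    """Assumed form of (2.1): e^2 |r''|^2 / (6 pi epsilon_0 c^3)."""
    return E_CHARGE ** 2 * acceleration ** 2 / (6.0 * math.pi * EPSILON_0 * C ** 3)


def collapse_time(radius, z=1):
    """Kinetic energy divided by the radiation loss rate, as in (4.13)."""
    acceleration = centripetal_acceleration(radius, z)
    kinetic_energy = 0.5 * M_ELECTRON * radius * acceleration
    return kinetic_energy / radiated_power(acceleration)


def thomson_cross_section():
    """sigma_T = 8 pi r_e^2 / 3 with r_e the classical electron radius."""
    r_e = E_CHARGE ** 2 / (4.0 * math.pi * EPSILON_0 * M_ELECTRON * C ** 2)
    return 8.0 * math.pi * r_e ** 2 / 3.0


radius = 1.0e-10  # m, the atomic radius adopted in the text
print(f"T from energy / loss rate  = {collapse_time(radius):.3e} s")
print(f"T from 2 pi a^3 / (sigma c) = {2.0 * math.pi * radius ** 3 / (thomson_cross_section() * C):.3e} s")
```

Both routes give the same time, of order $10^{-10}$ s, and because $T \propto a^3$ the loss becomes faster still as the orbit shrinks.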

The pioneer atom model builders were well aware of this problem. Fortunately, the 
wavelength of light $\lambda$ is very much greater than the size of atoms $a$ and so the solution
was to place the electrons in orbits such that there would be no net acceleration when 
the acceleration vectors of all the electrons in the atom are added together. This requires, 
however, that the electrons are well ordered in their orbits about the nucleus. If, for example, 
there are two electrons in the atom, they can be placed in the same circular orbit on opposite
sides of the nucleus and so, to first order, there is no net dipole moment as observed 
at infinity, and hence no dipole radiation. There is, however, a finite electric quadrupole 
moment and hence radiation at the level $(a/\lambda)^2$, relative to the intensity of dipole radiation,
is expected. Since $a/\lambda \sim 10^{-3}$, the radiation problem can be significantly relieved. By
adding more electrons to the orbit, the quadrupole moment can be cancelled out as well 
and so, by adding sufficient electrons to each orbit, the radiation problem can be reduced 
to manageable proportions. Thus, before Thomson’s paper of 1906, the radiative instability 
could be overcome by assuming that a huge number of electrons were so disposed as to 
result in no net multipole moments of the electron distribution. Each orbit had to be densely 
populated with a well-ordered system of large numbers of electrons. This was the basis of 
Thomson’s ‘plum-pudding’ model in which the well-ordered orbits were embedded in a 
sphere of positive charge. Thomson’s result of 1906 that the numbers of electrons in atoms 
is of the same order as the atomic mass number meant that this problem could no longer be 
ignored. The radiative problem was particularly severe for hydrogen, which possesses only 
one electron. 

The other problem with the Saturnian picture of Nagaoka was its mechanical instability. 
Nagaoka had been inspired by Maxwell’s model of Saturn’s rings in which stable oscillations 
were found when rings of particles were perturbed. He attempted to associate the spectral 
lines of atoms with these oscillations. In Maxwell’s case the perturbations were stable under 
the attractive force of gravity between particles of the ring, but in the case of the repulsive 
electrostatic forces between electrons, the perturbations were unstable. This mechanical 
instability was inevitable, even if the radiative instability could be eliminated. 


4.3.3 Rutherford, α-particles and the discovery of the atomic nucleus


The nature of β-rays as electrons was quickly assimilated into the armoury of the physicist,
but what about the nature of the α-particles? In 1902, while Rutherford was at McGill
University in Canada, he showed that the α-particles were deflected by electric and magnetic
fields and that their value of e/m was roughly that of the charge-to-mass ratio of hydrogen
ions (Rutherford, 1903). Rutherford took up the Langworthy Professorship of Physics at
Manchester University in 1907 and, in the following year, demonstrated convincingly that
α-particles are helium nuclei (Rutherford and Royds, 1909). A source of α-particles was
inserted into a fine glass tube which could be inserted into an evacuated discharge tube.
Before this ‘needle’ was inserted, when a high voltage was maintained across the discharge
tube, no evidence for helium was observed. Once the needle was inserted into the tube,
the α-particles passed through the thin walls of the glass tube, which were only 0.01 mm
in thickness, and the characteristic lines of helium were observed in the discharge tube
(Fig. 4.2). This was convincing evidence that α-particles are the nuclei of helium atoms.

Fig. 4.2 (a) The apparatus with which Rutherford and Royds (1909) demonstrated that α-particles
are the nuclei of helium atoms. The fine glass tube containing the source of α-particles, a
sample of radium, is labelled A. (b) The original experiment in the Cavendish museum.

The discovery of the nuclear structure of atoms resulted from a brilliant series of experiments
carried out by Rutherford and his colleagues, Hans Geiger and Ernest Marsden, in
the period 1909-1912. Rutherford had been impressed by the fact that α-particles could
pass through thin films rather easily, suggesting that much of the volume of atoms is empty
space, although there was clear evidence for small-angle scattering. Rutherford persuaded
Marsden, who was still an undergraduate, to investigate whether or not α-particles were
deflected through large angles on being fired at a thin gold foil target. To Rutherford’s
astonishment, a few particles were deflected by more than 90°, and a very small number
almost returned along the direction of incidence. In Rutherford’s words:


‘It was quite the most incredible event that has ever happened to me in my life. It was 
almost as incredible as if you fired a 15-inch shell at a piece of tissue paper and it came 
back and hit you.’ (Andrade, 1964) 


Rutherford realised that it required a very considerable force to send the α-particle back
along its track. In 1911 he hit upon the idea that, if all the positive charge were concentrated
in a compact nucleus, the scattering could be attributed to the repulsive electrostatic force
between the incoming α-particle and the positive nucleus. Rutherford was no theorist, but
he used his knowledge of central orbits in inverse-square law fields of force to work out the
properties of what became known as Rutherford scattering (Rutherford, 1911). The orbit
of the α-particle is a hyperbola, the angle of deflection $\phi$ being

$$\cot\frac{\phi}{2} = \left(\frac{4\pi\epsilon_0 m_\alpha}{2Ze^2}\right) p_0 v_0^2, \tag{4.14}$$

where $p_0$ is the collision parameter, $v_0$ is the initial velocity of the α-particle, $m_\alpha$ its
mass and $Z$ the nuclear charge. It is straightforward to work out the probability that the
α-particle is scattered through an angle $\phi$. The result is

$$p(\phi) \propto \frac{1}{v_0^4}\,\mathrm{cosec}^4\frac{\phi}{2}, \tag{4.15}$$

the famous $\mathrm{cosec}^4(\phi/2)$ law derived by Rutherford, which was found to explain precisely
the observed distribution of scattering angles of the α-particles (Geiger and Marsden, 1913).

Rutherford had, however, achieved much more. The fact that the scattering law was 
obeyed so precisely, even for large angles of scattering, meant that the inverse-square law 
of electrostatic repulsion held good to very small distances indeed. They found that the 
nucleus had to have size less than about 10⁻¹⁴ m, very much less than the sizes of atoms, 
which are typically about 10⁻¹⁰ m. 
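
As a rough numerical illustration of (4.14) and (4.15), the following Python sketch evaluates the deflection angle for a given collision parameter, the angular factor of the scattering law, and the head-on distance of closest approach. The α-particle kinetic energy of 5 MeV and the choice of gold (Z = 79) are illustrative assumptions rather than values taken from the text, the closest-approach formula is the standard energy-conservation result, and the function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
MEV = 1.602176634e-13        # 1 MeV in joules

def deflection_angle(p0, kinetic_energy, Z):
    """Deflection angle phi (radians) from (4.14), with m_alpha * v0**2 = 2 * kinetic_energy."""
    cot_half = 4 * math.pi * EPS0 * (2 * kinetic_energy) * p0 / (2 * Z * E_CHARGE**2)
    return 2 * math.atan2(1.0, cot_half)

def relative_rate(phi):
    """cosec^4(phi/2): the angular factor of (4.15) at fixed v0."""
    return 1.0 / math.sin(phi / 2) ** 4

def closest_approach(kinetic_energy, Z):
    """Head-on distance of closest approach, from energy conservation."""
    return 2 * Z * E_CHARGE**2 / (4 * math.pi * EPS0 * kinetic_energy)

T_ALPHA = 5.0 * MEV   # assumed alpha-particle kinetic energy (illustrative)
Z_GOLD = 79           # nuclear charge of the gold target

d_min = closest_approach(T_ALPHA, Z_GOLD)
phi = deflection_angle(d_min / 2, T_ALPHA, Z_GOLD)

print(f"closest approach: {d_min:.2e} m")
print(f"deflection for p0 = d_min/2: {math.degrees(phi):.1f} degrees")
print(f"rate(90 deg) / rate(180 deg): {relative_rate(math.pi / 2) / relative_rate(math.pi):.1f}")
```

Under these assumptions the closest approach comes out at a few times 10⁻¹⁴ m, the same order as the upper limit on the nuclear size quoted above, and a collision parameter of half that distance produces a 90° deflection.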

Rutherford attended the First Solvay Conference in 1911, but made no mention of his 
remarkable experiments, which led directly to his nuclear model of the atom. Remarkably, 
this key result for understanding the nature of atoms made little impact upon the physics 
community at the time and it was not until 1914 that Rutherford was thoroughly convinced 
of the necessity of adopting his nuclear model of the atom. Before that time, however, 
someone else did — Niels Bohr, the first theorist to apply successfully quantum concepts to 
the structure of atoms. 


4.4 Haas’s and Nicholson’s models of atoms 


Bohr was not, however, the first physicist to attempt to introduce quantum concepts into 
the construction of atomic models. In 1910, a Viennese doctoral student, Arthur Erich 
Haas, realised that, if Thomson’s sphere of positive charge were uniform, an electron would 
perform simple harmonic motion through the centre of the sphere, since the restoring force 
at radius r from the centre would be, according to Gauss’s theorem in electrostatics, 


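$$ f = m_e\ddot{r} = -\frac{eQ(\le r)}{4\pi\epsilon_0 r^2} = -\left(\frac{eQ}{4\pi\epsilon_0 a^3}\right) r , \tag{4.16} $$

where Q(≤ r) = Q(r/a)³ is the positive charge within radius r, a is the radius of the atom 
and Q the total positive charge. For a hydrogen atom, for 
which Q = e, the frequency of oscillation of the electron is 

$$ \nu = \frac{1}{2\pi}\left(\frac{e^2}{4\pi\epsilon_0 m_e a^3}\right)^{1/2} . \tag{4.17} $$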




Haas argued that the energy of oscillation of the electron, E = e²/4πε₀a, should be quantised 
and set equal to hν. Therefore, using (4.17) for ν, 

$$ h = 2\pi\left(\frac{e^2 m_e a}{4\pi\epsilon_0}\right)^{1/2} . \tag{4.18} $$


Haas used (4.18) to show how Planck’s constant could be related to the properties of atoms, 
taking for ν the short wavelength limit of the Balmer series, that is, allowing m → ∞ in 
the Balmer formula (1.17) (Haas, 1910a,b,c). Haas’s efforts were discussed by Lorentz at 
the 1911 Solvay Conference, but they did not attract much attention. According to Haas’s 
approach, Planck’s constant was simply a property of atoms as described by (4.18), whereas 
those already converted to quanta preferred to believe that h had much deeper significance. 
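
To see what Haas's argument amounts to numerically, the sketch below takes ν to be the Balmer series limit, inverts the oscillation frequency formula (4.17) for the atomic radius a, and then sets e²/4πε₀a equal to hν. It uses modern values of the constants, so it is an illustration of the logic rather than a reproduction of Haas's own arithmetic; the variable names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
M_E = 9.1093837015e-31       # electron mass, kg
C_LIGHT = 2.99792458e8       # speed of light, m/s
H_PLANCK = 6.62607015e-34    # modern value of Planck's constant, J s
R_INF = 1.097e7              # Rydberg constant, m^-1

# Short-wavelength limit of the Balmer series: 1/lambda = R_inf / 2^2
nu_limit = R_INF * C_LIGHT / 4.0

# Invert (4.17): (2 pi nu)^2 = e^2 / (4 pi eps0 m_e a^3)
a = (E_CHARGE**2 / (4 * math.pi * EPS0 * M_E * (2 * math.pi * nu_limit) ** 2)) ** (1.0 / 3.0)

# Haas's condition: e^2 / (4 pi eps0 a) = h nu
h_haas = E_CHARGE**2 / (4 * math.pi * EPS0 * a) / nu_limit

print(f"atomic radius a = {a:.3e} m")
print(f"h from Haas's relation = {h_haas:.3e} J s")
print(f"ratio to modern h = {h_haas / H_PLANCK:.3f}")
```

With these inputs the estimate has the correct order of magnitude, coming out at about twice the modern value of h, for an atomic radius of about 2 × 10⁻¹⁰ m.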

The next clue was provided by the work of the Cambridge physicist John William 
Nicholson, who arrived at the concept of the quantisation of angular momentum. Nicholson 
(1911, 1912) had shown that, although the Saturnian model of the atom is unstable for 
perturbations in the plane of the orbit, perturbations perpendicular to the plane are stable 
for orbits containing up to five electrons — he assumed that the unstable modes in the 
plane of the orbit were suppressed by some unspecified mechanism. The frequencies of 
the stable oscillations were multiples of the orbital frequency and he compared these with 
the frequencies of the lines observed in the spectra of bright nebulae, particularly with 
the ‘nebulium’ and ‘coronium’ lines. Performing the same exercise for ionised atoms with 
one less orbiting electron, further matches to the astronomical spectra were obtained. The 
frequency of the orbiting electrons remained a free parameter, but when he worked out 
the angular momentum associated with them, Nicholson found that they turned out to be 
multiples of h/2π. When Bohr returned to Copenhagen from England in 1912, he was 
perplexed by the success of Nicholson’s model, which seemed to provide a successful, 
quantitative model for the structure of atoms and which could account for the spectral lines 
observed in astronomical spectra. 


4.5 The Bohr model of the hydrogen atom 


Niels Bohr completed his doctorate on the electron theory of metals in 1911. Even at that 
stage, he had convinced himself that this theory was seriously incomplete and required 
further mechanical constraints on the motion of electrons at the microscopic level. He spent 
the following year in England, working for seven months with Thomson at the Cavendish 
Laboratory in Cambridge, and four months with Rutherford in Manchester. Bohr was 
immediately struck by the significance of Rutherford’s model of the nuclear structure of the 
atom and began to devote all his energies to understanding atomic structure on that basis. 
He quickly appreciated the distinction between the chemical properties of atoms, which 
are associated with the orbiting electrons, and radioactive processes which are associated 
with activity in the nucleus. On this basis, he could understand the nature of the isotopes 
of a particular chemical species. Bohr also realised from the outset that the structure of 
atoms could not be understood on the basis of classical physics. The obvious way forward 
was to incorporate the quantum concepts of Planck and Einstein into the models of atoms. 
Einstein’s statement, quoted in Sect. 3.4, 


‘... for ions which can vibrate with a definite frequency, ... the manifold of possible states 
must be narrower than it is for bodies in our direct experience.’ (Einstein, 1906c) 


was precisely the type of constraint which Bohr was seeking. Such a mechanical con- 
straint was essential to understand how atoms could survive the inevitable instabilities 
according to classical physics. How could these ideas be incorporated into models of 
atoms? 

In the summer of 1912, Bohr wrote an unpublished memorandum for Rutherford, in 
which he made his first attempt at quantising the energy levels of the electrons in atoms 
(Bohr, 1912). He proposed relating the kinetic energy T of the electron to the frequency 
ν′ = v/2πa of its orbit about the nucleus through the relation 

$$ T = \tfrac{1}{2} m_e v^2 = K\nu' , \tag{4.19} $$


where K is a constant which he expected would be of the same order of magnitude as 
Planck’s constant h. Bohr believed there must be some such non-classical constraint in 
order to guarantee the stability of atoms. Indeed, his criterion (4.19) absolutely fixed the 
kinetic energy of the electron about the nucleus. For a bound circular orbit, 


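$$ \frac{m_e v^2}{a} = \frac{Ze^2}{4\pi\epsilon_0 a^2} , \tag{4.20} $$

where Z is the positive charge of the nucleus in units of the charge of the electron e. As is 
well known, the binding energy of the electron is 

$$ E = T + U = \tfrac{1}{2} m_e v^2 - \frac{Ze^2}{4\pi\epsilon_0 a} = -\frac{Ze^2}{8\pi\epsilon_0 a} = \frac{U}{2} , \tag{4.21} $$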





where U is the electrostatic potential energy. The quantisation condition (4.19) enables 
both v and a to be eliminated from the expression for the kinetic energy of the electron. A 
straightforward calculation shows that 

$$ T = \frac{m_e Z^2 e^4}{32\,\epsilon_0^2 K^2} , \tag{4.22} $$

which was to prove to be of central significance for Bohr. His memorandum containing 
these ideas was principally about issues such as the number of electrons in atoms, atomic 
volumes, radioactivity, the structure and binding of diatomic molecules and so on. There 
is no mention of spectroscopy, which he and Thomson considered too complex to provide 
useful information. 
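
The ‘straightforward calculation’ leading to (4.22) can be sketched as follows, using only (4.19), (4.20) and (4.21). This is a reconstruction of the intermediate steps, which are not spelled out in the text.

```latex
\begin{align*}
% (4.19) with \nu' = v / 2\pi a fixes the speed in terms of the radius
\tfrac{1}{2} m_e v^2 = \frac{K v}{2\pi a}
  \quad &\Longrightarrow \quad v = \frac{K}{\pi m_e a} , \\
% substitute into the force balance (4.20), m_e v^2 = Z e^2 / 4\pi\epsilon_0 a
\frac{K^2}{\pi^2 m_e a^2} = \frac{Z e^2}{4\pi\epsilon_0 a}
  \quad &\Longrightarrow \quad a = \frac{4\epsilon_0 K^2}{\pi m_e Z e^2} , \\
% (4.21) gives T = -E = Z e^2 / 8\pi\epsilon_0 a
T = \frac{Z e^2}{8\pi\epsilon_0 a}
  = \frac{Z e^2}{8\pi\epsilon_0}\,\frac{\pi m_e Z e^2}{4\epsilon_0 K^2}
  &= \frac{m_e Z^2 e^4}{32\,\epsilon_0^2 K^2} .
\end{align*}
```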
The breakthrough came in early 1913, when Hans Marius Hansen told Bohr about the 
Balmer formula for the wavelengths, or frequencies, of the spectral lines in the spectrum 


of hydrogen, 

$$ \frac{1}{\lambda} = \frac{\nu}{c} = R_\infty\left(\frac{1}{2^2} - \frac{1}{n^2}\right) , \tag{4.23} $$

where R∞ = 1.097 × 10⁷ m⁻¹ is the Rydberg constant and n = 3, 4, 5, ... As Bohr recalled 
much later, 


‘As soon as I saw Balmer’s formula, the whole thing was clear to me.’ (Bohr, 1963) 


He realised immediately that this formula contained within it the crucial clue for the 
construction of a model of the hydrogen atom, which he took to consist of a single negatively 
charged electron orbiting a positively charged nucleus. He went back to his memorandum 
on the quantum theory of the atom, in particular, to his expression for the binding energy, 
or kinetic energy, of the electron (4.21). He realised that he could determine the value of 
his constant K from the expression for the Balmer series. The running term in 1/n² can be 
associated with (4.23), if we write for hydrogen with Z = 1, 

$$ T = \frac{m_e e^4}{32\,\epsilon_0^2 n^2 K^2} . \tag{4.24} $$

Then, when the electron changes from an orbit with quantum number n to that with n = 2, 
the energy of the emitted radiation would be the difference in kinetic energies of the two 
states. Applying Einstein’s quantum hypothesis, this energy should be equal to hν. Inserting 
the numerical values of the constants into (4.24), Bohr found that the constant K was exactly 
h/2. Therefore, the energy of the state with quantum number n is 

$$ E = -T = -\frac{m_e e^4}{8\,\epsilon_0^2 n^2 h^2} . \tag{4.25} $$





The angular momentum of the state could be found immediately by writing T = ½Iω² = 
2π²m_e a²ν′² and setting this equal to T = nhν′/2, from which it follows that 

$$ J = I\omega = \frac{nh}{2\pi} . \tag{4.26} $$

This is how Bohr arrived at the quantisation of angular momentum according to the old 
quantum theory. Perhaps most spectacularly, the theory enabled the value of the Rydberg 
constant to be expressed in terms of fundamental physical constants. From (4.25), it follows 
immediately that 

$$ R_\infty = \frac{m_e e^4}{8\,\epsilon_0^2 h^3 c} . \tag{4.27} $$

In the first paper of his famous trilogy (Bohr, 1913a,b,c), Bohr acknowledged that Nicholson 
had discovered the quantisation of angular momentum in his papers of 1912. These results 
were the inspiration for what became known as the Bohr model of the atom. 
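
The expressions (4.25) and (4.27) are easily checked numerically. The following sketch, which assumes modern values of the fundamental constants and uses hypothetical function names, evaluates the Rydberg constant and the ground-state energy of hydrogen, and confirms that the energy difference between the n = 3 and n = 2 states reproduces the first Balmer line of (4.23).

```python
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
M_E = 9.1093837015e-31       # electron mass, kg
H_PLANCK = 6.62607015e-34    # Planck's constant, J s
C_LIGHT = 2.99792458e8       # speed of light, m/s

def bohr_energy(n):
    """Energy of the hydrogen stationary state n from (4.25), in joules."""
    return -M_E * E_CHARGE**4 / (8 * EPS0**2 * n**2 * H_PLANCK**2)

def rydberg_constant():
    """Rydberg constant from (4.27), in m^-1."""
    return M_E * E_CHARGE**4 / (8 * EPS0**2 * H_PLANCK**3 * C_LIGHT)

R_inf = rydberg_constant()
ground_state_eV = bohr_energy(1) / E_CHARGE

# Wavenumber of the n = 3 -> 2 transition from the energy difference, h nu = E_3 - E_2
wavenumber_32 = (bohr_energy(3) - bohr_energy(2)) / (H_PLANCK * C_LIGHT)

print(f"R_inf = {R_inf:.4e} m^-1")
print(f"ground-state energy = {ground_state_eV:.2f} eV")
print(f"first Balmer line = {1e9 / wavenumber_32:.1f} nm")
```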

In addition to the Balmer series of hydrogen, Bohr’s expression could account for the 
Paschen series of hydrogen which had been discovered in the near-infrared region of the 
spectrum by Friedrich Paschen in 1908 (Paschen, 1908). In this case, Bohr’s formula became 

$$ \frac{1}{\lambda} = \frac{\nu}{c} = R_\infty\left(\frac{1}{m^2} - \frac{1}{n^2}\right) , \tag{4.28} $$

where again R∞ = 1.097 × 10⁷ m⁻¹ but now m = 3 and n = 4, 5, 6, ... The formula also 
predicted a series of lines with m = 1 and n = 2, 3, 4, ..., which was discovered by 
Theodore Lyman in 1914, the Lyman series of hydrogen (Lyman, 1914). 
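
A minimal sketch of how (4.28) generates the three series is given below. It uses the value of R∞ quoted in the text; the function and dictionary names are illustrative only.

```python
R_INF = 1.097e7  # Rydberg constant in m^-1, as quoted in the text

def wavelength_nm(m, n):
    """Wavelength in nm of the line with lower level m and upper level n, from (4.28)."""
    if n <= m:
        raise ValueError("the upper quantum number n must exceed m")
    inverse_wavelength = R_INF * (1.0 / m**2 - 1.0 / n**2)
    return 1e9 / inverse_wavelength

series = {"Lyman": 1, "Balmer": 2, "Paschen": 3}

# First line of each series (n = m + 1) and the series limit (n -> infinity)
first_lines = {name: wavelength_nm(m, m + 1) for name, m in series.items()}
series_limits = {name: 1e9 * m**2 / R_INF for name, m in series.items()}

for name in series:
    print(f"{name}: first line {first_lines[name]:.1f} nm, limit {series_limits[name]:.1f} nm")
```

The first lines fall in the ultraviolet (Lyman), the visible (Balmer) and the near-infrared (Paschen), consistent with the regions of the spectrum in which the series were found.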


In the first paper of the trilogy of 1913, Bohr noted that a similar formula to (4.23) 
could account for the Pickering series, which had been discovered in 1896 by Edward 
Pickering in the spectra of stars (Pickering, 1896). In 1912, Alfred Fowler discovered 
similar series in laboratory experiments (Fowler, 1912). Bohr argued that singly ionised 
helium atoms would have exactly the same spectrum as hydrogen, but the wavelengths of the 
corresponding lines would be four times shorter, as observed in the Pickering series. Fowler 
objected, however, that the ratio of the Rydberg constants for singly ionised helium and 
hydrogen was not 4, but 4.00163 (Fowler, 1913a). Bohr realised that the problem arose from 
neglecting the contribution of the mass of the nucleus to the computation of the moments 
of inertia of the hydrogen atom and the helium ion. If the angular velocity of the electron 
and the nucleus about their centre of mass is w, the condition for the quantisation of angular 
momentum is 


$$ \frac{nh}{2\pi} = \mu\omega R^2 , \tag{4.29} $$

where μ = m_e m_N/(m_e + m_N) is the reduced mass of the atom, or ion, which takes account 
of the contributions of both the electron and the nucleus to the angular momentum; R is 
their separation. Therefore, the ratio of Rydberg constants for ionised helium and hydrogen 
should be 





$$ \frac{R_{\rm He}}{R_{\rm H}} = 4\,\frac{1 + \dfrac{m_e}{M}}{1 + \dfrac{m_e}{4M}} = 4.00160 , \tag{4.30} $$


where M is the mass of the hydrogen atom (Bohr, 1913d). Thus, precise agreement was 
found between the theoretical estimates and laboratory measurements of the ratio of Ry- 
dberg constants for hydrogen and ionised helium. In a further paper to Nature, Fowler 
acknowledged that Bohr’s formula was indeed a better and more elegant explanation of the 
lines observed in the spectrum of singly ionised helium (Fowler, 1913b). 
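
The size of the reduced-mass correction in (4.30) can be checked with a few lines of Python. The mass ratio used below is a modern value of the electron-to-proton mass ratio and is an assumption of this sketch, as is the approximation, implicit in (4.30), that the helium nucleus has mass 4M.

```python
ELECTRON_TO_HYDROGEN_MASS = 1.0 / 1836.15   # m_e / M, modern value (assumption of this sketch)

def rydberg_ratio_he_to_h(me_over_M=ELECTRON_TO_HYDROGEN_MASS):
    """Ratio R_He / R_H from (4.30), taking the helium nucleus to have mass 4M."""
    return 4.0 * (1.0 + me_over_M) / (1.0 + me_over_M / 4.0)

ratio = rydberg_ratio_he_to_h()
print(f"R_He / R_H = {ratio:.5f}")
```

The result is close to 4.0016, in line with the theoretical and laboratory figures quoted above; the last digit depends upon the adopted value of the mass ratio.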

Bohr’s theory of the hydrogen atom was a quite remarkable achievement and the first 
convincing application of quantum concepts to atoms. Bohr’s dramatic results were per- 
suasive evidence for many scientists that Einstein’s quantum theory had to be taken really 
seriously for processes occurring on the atomic scale. In his biography of Bohr, Pais (1985) 
recounts the story of Hevesy’s encounter with Einstein in September 1913. When Einstein 
heard of Bohr’s analysis of the Balmer series of hydrogen, he remarked cautiously that 
Bohr’s work was very interesting, and important if right. When Hevesy told him about the 
helium results, Einstein responded, 


‘This is an enormous achievement. The theory of Bohr must then be right.’ 


4.6 Moseley and the X-ray spectra of the chemical elements 


Support for Bohr’s model was not long in coming. Barkla continued his studies of the 
scattered X-ray emission of different elements and in 1908 he and Charles Sadler discovered 
Fig. 4.3 Barkla’s summary of his experiments on the K (left curve) and L components (right curve) of fluorescent X-rays from 
samples of different elements plotted against atomic weight (Barkla, 1911b). The ordinate is the logarithm of the 
quantity λ/ρ where λ is the absorption coefficient defined by I = I₀e^(−λx). 


that each element had a characteristic X-ray signature which was correlated with the atomic 
weight of the material (Barkla and Sadler, 1908). In these experiments, the absorption of 
the X-rays by thin aluminium sheets was used to measure the ‘hardness’ or ‘softness’ of 
the X-ray emission. For a number of elements the fluorescent emission consisted of two 
components, a ‘soft’ component which was readily absorbed and a ‘hard’ component which 
suffered very much less absorption. In 1911, he summarised the results of his numerous 
absorption experiments (Fig. 4.3), demonstrating that the materials had both hard and soft 
components, which he labelled K and L (Barkla, 1911b). 

Henry Moseley was a member of Rutherford’s team in Manchester, but rather than 
working on radioactivity, he studied the characteristic X-ray emission of the elements. 
Barkla’s experiments provided a rough indication of the spectra of fluorescent X-rays but 
the picture changed dramatically with von Laue’s discovery of the diffraction of X-rays by 
crystals (Sect. 2.2.1). Von Laue and his colleagues used the crystal materials as transmission 
gratings in which the diffracted X-rays passed through the crystal and the diffraction pattern 
was recorded on a photographic plate. In contrast, William and Lawrence Bragg realised 
that pure crystal samples such as rock salt could be used as a diffraction grating in which the 
spectrum of the X-rays could be found in reflection according to Bragg’s law nλ = 2d sin θ, 
where λ is the wavelength of the radiation, d is the lattice spacing and θ is the angle between 
the crystal planes and the direction of incidence of the X-ray beam; n is an integer. The 
invention of the X-ray spectrometer enabled X-ray spectra to be recorded on photographic 
plates with high spectral resolution (Fig. 4.4). 
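
Bragg’s law can be turned into a short calculation of the reflection angles at which a given wavelength appears. In the sketch below, the lattice spacing of rock salt (0.282 nm) and the X-ray wavelength (0.154 nm) are illustrative assumptions, not values taken from the text, and the function name is hypothetical.

```python
import math

def bragg_angle_deg(wavelength, d, n=1):
    """Angle theta (degrees) between the crystal planes and the beam for order n: n*lambda = 2*d*sin(theta)."""
    sin_theta = n * wavelength / (2.0 * d)
    if not 0.0 < sin_theta <= 1.0:
        raise ValueError("no reflection exists for this order")
    return math.degrees(math.asin(sin_theta))

D_ROCK_SALT = 0.282e-9   # assumed lattice spacing of rock salt, m
WAVELENGTH = 0.154e-9    # assumed X-ray wavelength, m

theta_1 = bragg_angle_deg(WAVELENGTH, D_ROCK_SALT, n=1)
theta_2 = bragg_angle_deg(WAVELENGTH, D_ROCK_SALT, n=2)
print(f"first order: {theta_1:.1f} degrees, second order: {theta_2:.1f} degrees")
```

Rotating the crystal through these angles sweeps different wavelengths onto the photographic plate, which is the principle of the rotating crystal spectrometer.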

In Moseley’s experiments, different pure materials were inserted into an X-ray tube 
and the reflected X-rays were analysed spectroscopically by reflecting them from a care- 
fully prepared sample of rock salt. He discovered that the reflected spectrum consisted of 
Fig. 4.4 A diagram illustrating the operation of William and Lawrence Bragg's rotating crystal X-ray spectrometer. This 
apparatus was used by Moseley in his experiments in which the nature of the K and L lines of the elements was 
elucidated (Sommerfeld, 1919). 


continuum radiation superimposed upon which were strong X-ray lines. The continuum 
radiation was the bremsstrahlung, or braking radiation, of the energetic electrons deceler- 
ated in the material. The lines were responsible for the K and L components of the X-ray 
emission identified by Barkla. The K lines were split into two components which Moseley 
labelled Kα and Kβ. Corresponding splittings were observed in the L lines at somewhat 
longer wavelengths. The Kα line was about five times stronger than the Kβ line but the Kβ 
lines had frequencies about 10% greater than those of the Kα lines. 

By this date, it was well known that the number of electrons in atoms was roughly 
half the atomic weight and was correlated with the position of the element in the periodic 
table. It was also known that the atomic weights of many of the successive elements in 
the periodic table differed by about two mass units. In 1913, the Dutch lawyer Antonius 
Johannes van den Broek made the proposal that, starting with Z = 1 for hydrogen, each 
atom is characterised by the number of electrons which is exactly equal to the sequential 
order of the elements in the periodic table (Fig. 1.2) (van den Broek, 1913). Since there 
were gaps of more than two mass units in the periodic table, van den Broek proposed that 
these were filled by as yet undiscovered elements. 

The most spectacular result of Moseley’s experiments was the discovery of the correlation 
between the frequency of the X-ray lines and the atomic number Z, corresponding to the 
number of electrons in the neutral atom (Moseley, 1913, 1914). In his papers, he plotted 
these correlations separately for the Kα, Kβ, Lα and Lβ lines (Fig. 4.5). The remarkable 
linear correlation between the square root of the frequencies of the lines and the atomic 
number had a number of crucial consequences. Moseley wrote the correlation between the 
square root of the frequencies of the Kα lines and the atomic number in the somewhat 

Fig. 4.5 Moseley’s correlation diagram between frequency of the various X-ray lines and atomic number. The Kα, Kβ lines 
(bottom half of diagram) and Lα, Lβ lines (top half of diagram) are plotted separately and show perfect correlations 
with atomic number (Moseley, 1913, 1914). 


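provocative form 

$$ \text{K}\alpha\ \text{lines:} \quad \frac{\nu}{c} = R_\infty (Z - 1)^2 \left(\frac{1}{1^2} - \frac{1}{2^2}\right) , \tag{4.31} $$

where R∞ is the same constant which appears in Rydberg’s formula. For the L series, the 
correlations were described by 

$$ \text{L}\alpha\ \text{lines:} \quad \frac{\nu}{c} = R_\infty (Z - 7.4)^2 \left(\frac{1}{2^2} - \frac{1}{3^2}\right) . \tag{4.32} $$
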
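
Moseley’s empirical relations for the Kα and Lα lines, with screening constants 1 and 7.4, can be evaluated directly. The sketch below assumes that R∞ is expressed in m⁻¹, so that the frequency is c times the right-hand side, and uses copper (Z = 29) purely as an illustrative element; the function names are hypothetical.

```python
import math

R_INF = 1.097e7              # Rydberg constant, m^-1
C_LIGHT = 2.99792458e8       # speed of light, m/s
H_PLANCK = 6.62607015e-34    # Planck's constant, J s
E_CHARGE = 1.602176634e-19   # elementary charge, C

def k_alpha_frequency(Z):
    """Frequency (Hz) of the K-alpha line: c * R_inf * (Z - 1)^2 * (1/1^2 - 1/2^2)."""
    return C_LIGHT * R_INF * (Z - 1) ** 2 * (1.0 - 1.0 / 4.0)

def l_alpha_frequency(Z):
    """Frequency (Hz) of the L-alpha line: c * R_inf * (Z - 7.4)^2 * (1/2^2 - 1/3^2)."""
    return C_LIGHT * R_INF * (Z - 7.4) ** 2 * (1.0 / 4.0 - 1.0 / 9.0)

Z_COPPER = 29   # illustrative choice of element
k_alpha_energy_eV = H_PLANCK * k_alpha_frequency(Z_COPPER) / E_CHARGE
print(f"copper K-alpha photon energy: {k_alpha_energy_eV / 1e3:.2f} keV")

# The square root of the frequency increases by the same step for each unit of Z
steps = [math.sqrt(k_alpha_frequency(Z + 1)) - math.sqrt(k_alpha_frequency(Z)) for Z in range(20, 30)]
print(f"step in sqrt(frequency) per unit Z: {steps[0]:.4e}")
```

For copper the Kα energy comes out at about 8 keV, and the constant step in the square root of the frequency is the straight-line behaviour seen in Moseley’s diagram.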
Moseley wrote to Bohr in November 1913 that these results are ‘extremely simple and 
largely what you would expect’. By this he meant that a number of features of these 
empirical formulae could be immediately explained in terms of Bohr’s model of the atom. 
Rewriting Bohr’s formula for the frequencies of the spectral lines for a nucleus of charge 
Ze, we can write 


$$ \frac{\nu}{c} = R_\infty Z^2 \left(\frac{1}{m^2} - \frac{1}{n^2}\right) , \tag{4.33} $$

with m and n different values of the quantum number, generically to be called n, 
characterising the stationary states. The Kα line would correspond to transitions between the 
stationary states with n = 1 and n = 2, while the Lα lines would correspond to transitions 
from n = 3 to n = 2. For the Kα line, this would occur if an electron were removed from 
the n = 1 orbit and was replaced by an electron making a transition from the n = 2 to the 
n = 1 state. Similarly, the Lα lines would arise from an electron being removed from 
the n = 2 orbit and being replaced by one from the n = 3 orbit. It was a puzzle why 
the dependence upon atomic number should be (Z − 1)² and (Z − 7.4)² rather than Z². 
This was to remain a puzzle until the screening of the nucleus by the inner electrons was 
fully appreciated, but this realisation required a number of key additional features of atomic 
structure which were to be discovered over the following decade. Rutherford immediately 
appreciated the significance of Moseley’s discovery. In his words, 


‘The original suggestion of van den Broek that the charge of the nucleus is equal to the 
atomic number and not to half the atomic weight seems to me very promising. The idea 
has already been used by Bohr in his theory of the constitution of atoms. The strongest 
and most convincing evidence in support of this hypothesis will be found in a paper by 
Moseley in the Philosophical Magazine of this month. He there shows that the frequency 
of the X-radiations from a number of elements can be simply explained if the number of 
unit charges on the nucleus is equal to the atomic number. It would appear that the charge 
of the nucleus is the fundamental constant which determines the physical and chemical 
properties of the atom, while the atomic weight, although it approximately follows the 
order of the nuclear charge, is probably a complicated function of the latter depending 
upon the detailed structure of the nucleus.’ (Rutherford, 1913) 


Conclusions (4) and (5) of Moseley’s paper of 1914 read: 


(4) The order of the atomic numbers is the same as that of the atomic weights, except 
where the latter disagrees with the order of the chemical properties. 

(5) Known elements correspond with all the numbers between 13 and 79 except three. 
There are here three possible elements still undiscovered. 


Conclusion 4 resulted in the reordering of the elements nickel (Z = 28) and cobalt (Z = 27), 
in accord with their chemical properties. Conclusion 5 resulted in the prediction of ele- 
ments with atomic numbers 43, 61 and 75 (see Fig. 4.5) which were only discovered many 
years later as the elements technetium (Tc), promethium (Pm) and rhenium (Re) respec- 
tively. Tragically, Moseley was killed in action during the Gallipoli campaign in Turkey on 
10 August 1915. 
Fig. 4.6 Franck and Hertz’s measurements to determine the ionisation potential of mercury vapour. In fact, the maxima in the 
graph correspond to the excitation of electrons from the ground state to the first excited state 4.9 eV above the ground 
level (Franck and Hertz, 1914). 


4.7 The Franck-Hertz experiment 





The reality of the stationary states within atoms was reinforced by the experiments of 
James Franck and Gustav Hertz (Franck and Hertz, 1914). Their objective was to measure 
the ionisation potentials of atoms by bombarding them with electrons which had been 
accelerated through a precisely known electrostatic potential. In their classic experiment of 
1914, they aimed to measure the ionisation potential of mercury vapour. Their results are 
shown in Fig. 4.6 in which, at small voltages, the current plotted on the ordinate increases as 
the energy of the electrons measured in electron volts increases. At an accelerating voltage 
of 4.9 V, however, there is a sudden decrease in the current. As the voltage is increased 
further, the current again increases and then suddenly decreases at 9.8 V. In fact, these 
steep decreases were found at integral multiples of the voltage of the first decrease at 4.9 V. 
In addition, they found evidence for an emission line at a wavelength of 253.6 nm, which 
would correspond to an energy of hv = 4.9 eV. They interpreted 4.9 eV as the ionisation 
potential of mercury vapour. 

If this were correct it would have contradicted the basic postulates of the Bohr model of 
the atom since the mercury spectrum also displayed higher frequency Paschen lines with 
a series limit at 185.0 nm. Bohr interpreted the results of their experiments differently. 
He argued that the energy of 4.9 eV corresponded to the excitation energy of an electron 
from the ground state to the first excited state and that the line at 253.6 nm represented 
the emission associated with the transition of an electron from the first excited state to fill 
the vacancy in the ground state (Bohr, 1915). The steep decreases at multiples of the first 
excitation energy corresponded to the electron being accelerated through a sufficient voltage to 
remove two or more successive electrons from the ground state. In due course, Franck and 
Hertz agreed with Bohr’s interpretation. 

The significance of the Franck—Hertz experiment was that it provided compelling evi- 
dence for the existence of stationary states within atoms, quite independent of the spectro- 
scopic data. 
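
The correspondence between the wavelengths and the energies quoted in this section is a one-line application of E = hν = hc/λ. The sketch below, with a hypothetical function name and modern values of the constants, converts the 253.6 nm line and the 185.0 nm series limit to electron volts and lists the first few accelerating voltages at which decreases in the current would be expected.

```python
H_PLANCK = 6.62607015e-34    # Planck's constant, J s
C_LIGHT = 2.99792458e8       # speed of light, m/s
E_CHARGE = 1.602176634e-19   # elementary charge, C

def photon_energy_eV(wavelength_nm):
    """Photon energy in eV for a wavelength given in nm, from E = h c / lambda."""
    return H_PLANCK * C_LIGHT / (wavelength_nm * 1e-9) / E_CHARGE

excitation_eV = photon_energy_eV(253.6)     # line observed by Franck and Hertz
series_limit_eV = photon_energy_eV(185.0)   # series limit quoted in the text

# Decreases in current are expected at integral multiples of the excitation energy
dip_voltages = [k * excitation_eV for k in (1, 2, 3)]

print(f"253.6 nm corresponds to {excitation_eV:.2f} eV")
print(f"185.0 nm corresponds to {series_limit_eV:.2f} eV")
print("expected decreases near", ", ".join(f"{v:.1f} V" for v in dip_voltages))
```

The 253.6 nm line corresponds to about 4.9 eV, whereas the 185.0 nm limit corresponds to a larger energy, which is why 4.9 eV could not be the ionisation potential.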


4.8 The reception of Bohr’s theory of the atom 


The experiments and their interpretation described in the last three sections indicated 
that, whatever reservations theorists might have had about the underlying physics of the 
Bohr model of the atom, the concepts of quantisation and stationary states within atoms 
had to be taken seriously. Despite the concerns about the stability of atoms according 
to Bohr’s picture, many results were beginning to fall into place and the theorists could 
not neglect them. The succeeding years would witness a major increase in activity in 
experimental and theoretical physics dedicated to the study of quanta. The Bohr picture 
contained only a single quantum number n which labelled the energies of the stationary 
states. This was the first step along a chain of events which would lead to the appreciation 
that more quantum numbers and selection rules would have to be introduced to understand 
the plethora of quantum phenomena observed in atoms. In these endeavours, Bohr was 
fortunate in involving Sommerfeld in tackling these problems. 
5.1 Introduction 


Bohr’s success in accounting for the frequencies observed in the spectral series of hydrogen 
was rightly regarded as a triumph, despite the fact that it violated the classical laws of 
mechanics and electromagnetism. It could not account, however, for the spectra of helium 
and heavier elements. The Bohr model was the simplest possible model for the dynamics 
of a single electron in the electrostatic potential of a positively charged point nucleus, in 
that it involved only quantised circular orbits defined by a single quantum number n, what 
became known as the principal quantum number. At the 1911 Solvay Conference, before 
Bohr’s announcement of his model for the hydrogen atom, Poincaré had raised the issue 
of how the quantisation conditions could be extended to systems of more than one degree 
of freedom. The problem was attacked by both Planck and Sommerfeld. Their approaches 
ended up being essentially the same, although expressed in somewhat different language. 
We will follow Sommerfeld’s approach. 

In 1891 Michelson had shown that the Hα and Hβ lines of the Balmer series displayed 
very narrow splittings (Michelson, 1891, 1892). Although incompatible with Bohr’s 
theory, the problem was set aside in the face of the other remarkable successes of the 
theory. Sommerfeld suspected that the explanation lay in the fact that Bohr’s quantisa- 
tion condition involved only a single degree of freedom. In his papers of 1915 and 1916, 
he extended the quantisation of the orbits of the electron to more than one degree of 
freedom and accounted for the splitting of the lines of the Balmer series once a spe- 
cial relativistic treatment of the model was adopted (Sommerfeld, 1915a,b, 1916a). These 
advances are beautifully described in his influential book Atombau und Spektrallinien 
(Atomic Structure and Spectral Lines) (Sommerfeld, 1919) which was based upon his lec- 
ture courses on atomic spectra and their interpretation. Van der Waerden (1967) remarks 
that, 


‘it was mainly from this book that the young physicists who created Quantum Mechanics 
in 1925—26 learned Quantum Theory.’ 


Let us demonstrate exactly what Sommerfeld achieved. 
Fig. 5.1 Illustrating the geometry of an ellipse with eccentricity ε = 0.5 in the (r, φ) coordinate system used in Sect. 5.2. 


5.2 Sommerfeld’s extension of the Bohr model to elliptical orbits 


The obvious extension of the Bohr model was to elliptical rather than circular orbits. The 
equation for an ellipse in pedal, or (r, φ), coordinates is 

$$ \frac{\lambda}{r} = 1 + \epsilon\cos\phi , \tag{5.1} $$

where λ = a(1 − ε²) is the semi-latus rectum, a is the semi-major axis of the ellipse and ε 
is its eccentricity. r is the radial distance from a focus to a point on the ellipse and φ is the 
angle between the major axis and the radius vector r. The semi-minor axis of the ellipse is 
b = a(1 − ε²)^(1/2). This geometry is illustrated in Fig. 5.1. 

It is apparent that there are now two independent parameters which define the orbit of 
the electron, r and φ. The Bohr model involves the quantisation of the angular coordinate 
φ through the quantisation of angular momentum m_e v r = n_φ h/2π = n_φ ħ, where r and v 
are constants.¹ Bohr and Sommerfeld realised that this quantisation condition could also 
be written in the form 

$$ \int_0^{2\pi} p_\phi \, d\phi = n_\phi h , \tag{5.2} $$

where p_φ is the angular momentum and φ the azimuthal angle. Sommerfeld, widely recognised 
as one of the leading mathematical physicists in Europe, appreciated that p_φ and φ 
are canonical coordinates in Hamiltonian mechanics and so (5.2) provides the prescription 
for the quantisation condition of such pairs of coordinates. As expressed by Jammer (1989), 


‘Sommerfeld postulated that the stationary states of a periodic system with f degrees of 
freedom are determined by the condition that the ‘phase integral for every coordinate is 




an integral multiple of the quantum of action’ or that for k = 1, 2, ..., f, 

$$ \oint p_k \, dq_k = n_k h , \tag{5.3} $$

where p_k is the momentum corresponding to the coordinate q_k, n_k is a non-negative 
integer and the integration is extended over a period of q_k.’ 


Specifically, in the case of the elliptical orbits, the generalisation of the quantum conditions 
to two coordinates becomes 


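$$ \int_0^{2\pi} p_\phi \, d\phi = n_\phi h \quad \text{and} \quad \oint_{\rm orbit} p_r \, dr = n_r h , \tag{5.4} $$

where n_φ and n_r are non-negative integers. In polar coordinates q_i = (r, φ), the kinetic 
energy T of the electron is 

$$ T = \tfrac{1}{2} m_e \left(\dot{r}^2 + r^2 \dot{\phi}^2\right) . \tag{5.5} $$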


Then, according to the prescription developed in Sect. 5.4.3 (equations 5.52, 5.53 and 5.61), 
the corresponding momenta p_i = ∂T/∂q̇_i are 

$$ p_\phi = m_e r^2 \dot{\phi} \quad \text{and} \quad p_r = m_e \dot{r} , \tag{5.6} $$


which are the azimuthal angular momentum and radial momentum of the electron respec- 
tively. 

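The quantum condition for (p_φ, φ) yields exactly the same result given by the circular 
Bohr model, namely, 

$$ \int_0^{2\pi} p_\phi \, d\phi = n_\phi h ; \quad 2\pi p_\phi = n_\phi h \quad \text{or} \quad p_\phi = n_\phi \hbar . \tag{5.7} $$

p_φ is therefore a constant of the motion. To derive the quantum condition for the radial 
component of the momentum, we first rewrite (5.1) as follows: 

$$ \frac{1}{r} = \frac{1}{a}\,\frac{1 + \epsilon\cos\phi}{1 - \epsilon^2} . \tag{5.8} $$

Taking the derivative of (5.8) with respect to φ and then dividing by (5.8), 

$$ \frac{1}{r}\frac{dr}{d\phi} = \frac{\epsilon\sin\phi}{1 + \epsilon\cos\phi} . \tag{5.9} $$

We now need to work out p_r dr. This is achieved by the following relations: 

$$ p_r = m_e \dot{r} = m_e \frac{dr}{d\phi}\dot{\phi} = \frac{p_\phi}{r^2}\frac{dr}{d\phi} ; \qquad dr = \frac{dr}{d\phi}\, d\phi . \tag{5.10} $$

Therefore, using (5.9), 

$$ p_r \, dr = p_\phi \left(\frac{1}{r}\frac{dr}{d\phi}\right)^2 d\phi = p_\phi \, \epsilon^2 \, \frac{\sin^2\phi}{(1 + \epsilon\cos\phi)^2} \, d\phi . \tag{5.11} $$
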
Unlike p_φ, p_r is not a constant of the motion. The quantum condition for the radial 
component of the momentum therefore becomes 

$$ \oint p_r \, dr = p_\phi \, \epsilon^2 \int_0^{2\pi} \frac{\sin^2\phi}{(1 + \epsilon\cos\phi)^2} \, d\phi = n_r h . \tag{5.12} $$

Dividing by the quantum condition (5.7), we find 

$$ \frac{n_r}{n_\phi} = \frac{\epsilon^2}{2\pi} \int_0^{2\pi} \frac{\sin^2\phi}{(1 + \epsilon\cos\phi)^2} \, d\phi . \tag{5.13} $$



This is the result we have been seeking. The right-hand side of (5.13) depends only upon 
the eccentricity € and, because ng and n, are integers, the ellipticity is also quantised. 
Sommerfeld (1919) evaluated the integral (5.13) in Section 6 of the Mathematical Notes 
and Addenda of his book and found the result 








2 
1-2= em; (5.14) 
(ng + ny)? 
indicating that, of all possible ellipticities, only those involving the integers ng and n, are 
allowed. 
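The step from (5.13) to (5.14) can be checked numerically. The following is a minimal sketch, not Sommerfeld's evaluation: it integrates the right-hand side of (5.13) with a simple midpoint rule for eccentricities taken from (5.14) and compares the result with $n_r/n_\phi$. The function names and the number of integration steps are our own choices.

```python
import math

def radial_integral(eps, n=20000):
    """Midpoint-rule estimate of the integral in (5.13) over 0..2*pi."""
    h = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * h
        total += math.sin(phi) ** 2 / (1.0 + eps * math.cos(phi)) ** 2
    return total * h

def ratio_from_integral(eps):
    """Right-hand side of (5.13): (eps^2 / 2 pi) times the integral."""
    return eps ** 2 / (2.0 * math.pi) * radial_integral(eps)

def eccentricity(n_phi, n_r):
    """Eccentricity implied by (5.14) for the quantum numbers (n_phi, n_r)."""
    return math.sqrt(1.0 - (n_phi / (n_phi + n_r)) ** 2)

if __name__ == "__main__":
    for n_phi, n_r in [(1, 1), (2, 1), (1, 2), (3, 1)]:
        eps = eccentricity(n_phi, n_r)
        # The last two printed columns should agree if (5.14) solves (5.13).
        print(n_phi, n_r, ratio_from_integral(eps), n_r / n_phi)
```

The agreement reflects the closed form of the integral, $(2\pi/\epsilon^2)\left[(1-\epsilon^2)^{-1/2} - 1\right]$, which gives $n_r/n_\phi = (1-\epsilon^2)^{-1/2} - 1$ and hence (5.14).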
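Next, the energies of the orbits are evaluated. The kinetic energy of the electron is

$$T = \tfrac{1}{2} m_e (\dot r^2 + r^2 \dot\phi^2) = \frac{1}{2 m_e}\left(p_r^2 + \frac{p_\phi^2}{r^2}\right) . \tag{5.15}$$

Using (5.10), this can be rewritten

$$T = \frac{p_\phi^2}{2 m_e r^2}\left[\left(\frac{1}{r}\frac{dr}{d\phi}\right)^2 + 1\right] . \tag{5.16}$$

Next, we use (5.8) and (5.9) to eliminate $r$ from this expression,

$$T = \frac{p_\phi^2}{m_e a^2 (1 - \epsilon^2)^2}\left[\frac{1 + \epsilon^2}{2} + \epsilon\cos\phi\right] . \tag{5.17}$$

The potential energy of the electron can also be written in a form independent of $r$ using (5.8):

$$U = -\frac{Ze^2}{4\pi\varepsilon_0 r} = -\frac{Ze^2}{4\pi\varepsilon_0 a}\,\frac{1 + \epsilon\cos\phi}{1 - \epsilon^2} , \tag{5.18}$$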





where the charge of the nucleus is $Ze$, in other words, $Z$ is the atomic number of the nucleus. Therefore, the total energy of the electron in its elliptical orbit is

$$E_{\rm tot} = T + U = \frac{p_\phi^2}{m_e a^2 (1 - \epsilon^2)^2}\left[\frac{1 + \epsilon^2}{2} + \epsilon\cos\phi\right] - \frac{Ze^2}{4\pi\varepsilon_0 a}\,\frac{1 + \epsilon\cos\phi}{1 - \epsilon^2} . \tag{5.19}$$

The total energy is time-independent and so must be independent of $\phi$ which changes by $2\pi$ radians per orbit. The terms in $\cos\phi$ in (5.19) must therefore sum to zero and so we can find the value of $a$,

$$a = \frac{4\pi\varepsilon_0\,p_\phi^2}{Ze^2 m_e (1 - \epsilon^2)} . \tag{5.20}$$
With the terms in $\cos\phi$ set to zero, the total energy of the electron is

$$E_{\rm tot} = T + U = \frac{p_\phi^2}{m_e a^2 (1 - \epsilon^2)^2}\left[\frac{1 + \epsilon^2}{2}\right] - \frac{Ze^2}{4\pi\varepsilon_0 a}\,\frac{1}{1 - \epsilon^2} . \tag{5.21}$$

Using the result (5.20) for $a$, we find the simple expression,

$$E_{\rm tot} = -\frac{Ze^2}{8\pi\varepsilon_0 a} , \tag{5.22}$$

recalling that $a$ is the semi-major axis of the ellipse. The final step is to work out the quantised values of $a$ using (5.7) and (5.14),

$$E_{\rm tot} = -\frac{Ze^2}{8\pi\varepsilon_0 a} = -\frac{Z^2 e^4 m_e (1 - \epsilon^2)}{32\pi^2 \varepsilon_0^2\,p_\phi^2} = -\frac{Z^2 e^4 m_e}{8\varepsilon_0^2 h^2 (n_\phi + n_r)^2} . \tag{5.23}$$
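The chain (5.16) to (5.23) can be verified numerically for a particular orbit. The sketch below is our own check, assuming SI values of the constants (the numerical values are not given in the text): it evaluates $T + U$ from (5.16) and (5.18) at several points around a quantised ellipse, with $a$ taken from (5.20), and compares the result with the right-hand side of (5.23).

```python
import math

# Assumed SI values of the constants; they are not quoted in the text.
EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
E_CH = 1.602176634e-19    # elementary charge, C
M_E = 9.1093837015e-31    # electron mass, kg
H = 6.62607015e-34        # Planck constant, J s

def orbit_energy(n_phi, n_r, phi, Z=1):
    """T + U at azimuth phi on the ellipse (n_phi, n_r), using (5.7), (5.14), (5.20)."""
    p_phi = n_phi * H / (2.0 * math.pi)                      # (5.7)
    one_minus_eps2 = (n_phi / (n_phi + n_r)) ** 2            # (5.14)
    eps = math.sqrt(1.0 - one_minus_eps2)
    a = 4.0 * math.pi * EPS0 * p_phi ** 2 / (Z * E_CH ** 2 * M_E * one_minus_eps2)  # (5.20)
    r = a * one_minus_eps2 / (1.0 + eps * math.cos(phi))     # (5.8)
    slope = eps * math.sin(phi) / (1.0 + eps * math.cos(phi))  # (5.9)
    kinetic = p_phi ** 2 / (2.0 * M_E * r ** 2) * (slope ** 2 + 1.0)  # (5.16)
    potential = -Z * E_CH ** 2 / (4.0 * math.pi * EPS0 * r)  # (5.18)
    return kinetic + potential

def bohr_energy(n, Z=1):
    """Right-hand side of (5.23) with n = n_phi + n_r."""
    return -Z ** 2 * E_CH ** 4 * M_E / (8.0 * EPS0 ** 2 * H ** 2 * n ** 2)

if __name__ == "__main__":
    for phi in (0.0, 1.0, 2.5, 4.0):
        print(phi, orbit_energy(1, 1, phi) / E_CH, bohr_energy(2) / E_CH)  # in eV
```

The energy is the same at every azimuth and depends only on the sum $n_\phi + n_r$, which is the content of Sommerfeld's result.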





This is the remarkable result obtained by Sommerfeld in 1916. Even in his book of 1919, 
he can scarcely contain his excitement. He writes, 


‘This result is of the greatest consequence and is superlatively simple: we have found for the energy of elliptical orbits the same value as ... for circular orbits, with the one difference that the quantum number $n$ in the latter case is replaced by the quantum sum, $n_\phi + n_r$. Each of the quantised ellipses of our family has an amount of energy equivalent to that of a definite Bohr circle.’


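This result is identical to (4.25) for the hydrogen atom for which $Z = 1$ and $n_r = 0$. The range of potential quantum transitions has been greatly increased. Considering transitions between the upper (u) and lower (l) quantum states characterised by $(n^{\rm u}_\phi, n^{\rm u}_r)$ and $(n^{\rm l}_\phi, n^{\rm l}_r)$, evidently $(n^{\rm u}_\phi + n^{\rm u}_r)$ must be greater than $(n^{\rm l}_\phi + n^{\rm l}_r)$ and then the energy of the photon emitted in the transition is

$$h\nu = \frac{Z^2 e^4 m_e}{8\varepsilon_0^2 h^2}\left[\frac{1}{(n^{\rm l}_\phi + n^{\rm l}_r)^2} - \frac{1}{(n^{\rm u}_\phi + n^{\rm u}_r)^2}\right] , \tag{5.24}$$

which for hydrogen becomes

$$\frac{1}{\lambda} = \frac{\nu}{c} = R_\infty\left[\frac{1}{(n^{\rm l}_\phi + n^{\rm l}_r)^2} - \frac{1}{(n^{\rm u}_\phi + n^{\rm u}_r)^2}\right] . \tag{5.25}$$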


Thus, the spectral series is again identical with the Balmer and other series of hydrogen, but 
the number of ways in which the lines can be produced has been considerably increased. 
Sommerfeld’s words, again printed in italics, state: 





‘... it has a deepened theoretical significance and its origin now has multiple roots. By 
the admission of elliptical orbits, the series has gained no extra lines and has lost none 
of its sharpness.’ 


Sommerfeld developed his model of the atom using the following arguments. The case $n_\phi = 0$ corresponds to a degenerate ellipse, $\epsilon = 1$, which is a straight line joining the foci of the ellipse and so this trajectory would pass through the nucleus. Sommerfeld excluded this possibility so that $n_\phi = 1$, $n_r = 0$ is the lowest energy state. Secondly, when $n_r = 0$, the orbits become circles, $\epsilon = 0$, as can be seen from (5.14) for the ellipticity of the electron’s orbit. The resulting families of orbits for $(n_\phi + n_r) = 1, 2, 3$ and $4$ and $n_r = 1, 2$ and $3$ are shown in Fig. 5.2 in which the orbits have been drawn to scale with respect to the foci of the ellipses.

Fig. 5.2 The quantised circular and elliptical orbits parameterised by the quantum numbers $n_\phi$ and $n_r$ according to the Sommerfeld model. Orbits with the same values of $(n_\phi + n_r)$ have the same energies.

But much more has been achieved. Although the unperturbed hydrogen atom now has numerous ways of creating sharp Balmer lines, when the atom is placed in an electric or magnetic field, the lines associated with different initial and final states are split because of the different kinematics of the electron in the perturbed elliptical orbits. Specifically, the Stark and Zeeman effects should display plentiful fine structure (see Sects. 7.2 and 7.3). Consider for example the case of the transition $(n^{\rm u}_\phi + n^{\rm u}_r) = 4$ to $(n^{\rm l}_\phi + n^{\rm l}_r) = 3$. We can combine any of the orbits with $(n^{\rm u}_\phi + n^{\rm u}_r) = 4$ in Fig. 5.2 with any of those with $(n^{\rm l}_\phi + n^{\rm l}_r) = 3$, in other words, there are $(n^{\rm u}_\phi + n^{\rm u}_r) \times (n^{\rm l}_\phi + n^{\rm l}_r) = 4 \times 3 = 12$ ways in which the line can be produced. Likewise, for the Balmer series for which $(n^{\rm l}_\phi + n^{\rm l}_r) = 2$ and $(n^{\rm u}_\phi + n^{\rm u}_r) = 3, 4, 5, \ldots$ the degeneracies would be $3 \times 2 = 6$ for H$\alpha$, $4 \times 2 = 8$ for H$\beta$, $5 \times 2 = 10$ for H$\gamma$, $6 \times 2 = 12$ for H$\delta$ and so on. Sommerfeld appreciated that not all of these would be realised in nature and so there had to be additional principles of selection, or selection rules, which determine which transitions are allowed. Initially, he provided a set of empirical rules to avoid excessive numbers of lines.
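The counting in the paragraph above can be reproduced by direct enumeration. This is a minimal sketch with our own function names, assuming only what the text states: orbits are labelled by non-negative integers $(n_\phi, n_r)$, and the straight-line case $n_\phi = 0$ is excluded.

```python
def orbits(n):
    """All (n_phi, n_r) with n_phi + n_r = n, excluding the excluded case n_phi = 0."""
    return [(n_phi, n - n_phi) for n_phi in range(1, n + 1)]

def transition_count(n_upper, n_lower):
    """Number of ways of pairing an upper orbit with a lower orbit (no selection rules)."""
    return len(orbits(n_upper)) * len(orbits(n_lower))

if __name__ == "__main__":
    print(orbits(4))                 # the four orbits with n_phi + n_r = 4
    print(transition_count(4, 3))    # the 4 -> 3 transition discussed in the text
    for n_upper in (3, 4, 5, 6):     # first four Balmer lines, lower level n = 2
        print(n_upper, transition_count(n_upper, 2))
```

These are the numbers before any selection rules are imposed; the empirical rules mentioned in the text would reduce them.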

Sommerfeld proceeded to the next generalisation of the model by considering quantisation in three dimensions. Spherical polar coordinates are the natural extension of the polar coordinates considered so far, the three independent coordinates now being $(r, \theta, \phi)$, where $\theta$ is the polar angle measured from the axis of the $(r, \phi)$ coordinate system used above. The quantisation conditions follow exactly the same prescription as (5.4), namely,

$$\oint_{\rm orbit} p_r\,dr = n_r h ; \quad \int_0^{2\pi} p_\phi\,d\phi = n_\phi h ; \quad \int_0^{2\pi} p_\theta\,d\theta = n_\theta h , \tag{5.26}$$

where the new quantum number $n_\theta$ is associated with the conjugate pair $(p_\theta, \theta)$. Carrying out a similar analysis to the above two-dimensional calculation, Sommerfeld found that the energy levels had to satisfy the relation $n = n_\phi + n_\theta$, where $n$, $n_\phi$, $n_\theta$ are non-negative integers. Once again, the energy levels corresponded precisely to those of the hydrogen atom with additional degenerate energy levels. If we denote the total angular momentum $p_\psi$, the quantisation condition is

$$p_\psi = \frac{nh}{2\pi} = \frac{(n_\phi + n_\theta)h}{2\pi} , \tag{5.27}$$

but now the quantised ellipses can only be found at certain angles with respect to any specified direction. Thus, the component of angular momentum along a chosen direction is $n_\phi h/2\pi$ while the total angular momentum is $nh/2\pi = (n_\phi + n_\theta)h/2\pi$ and so the angle between the chosen direction and the total angular momentum vector is given by $\cos\alpha = n_\phi/n = n_\phi/(n_\phi + n_\theta)$. The inference is that the planes of the orbits of the ellipses can only take certain angles with respect to the chosen direction, which might be defined by an imposed electric or magnetic field. This was the discovery of space quantisation.
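To make the statement about discrete orientations concrete, the allowed angles can be listed for small $n$. This is a sketch under the relation quoted above, $\cos\alpha = n_\phi/n$ with $n = n_\phi + n_\theta$; the function name is ours, and whether the value $n_\phi = 0$ is physically admissible is left open here, since the text only says that the integers are non-negative.

```python
import math

def allowed_angles(n):
    """Angles alpha (degrees) with cos(alpha) = n_phi / n for n_phi = 0, 1, ..., n."""
    return [math.degrees(math.acos(n_phi / n)) for n_phi in range(n + 1)]

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, [round(angle, 2) for angle in allowed_angles(n)])
```

For each $n$ there are only $n + 1$ possible inclinations between the total angular momentum vector and the chosen direction, rather than a continuum.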


5.3 Sommerfeld and the fine-structure constant 


Perhaps the most remarkable results of Sommerfeld’s paper concerned the extension of the Bohr model to include the effects of special relativity. The quantisation conditions remain the same as in (5.4),

$$\int_0^{2\pi} p_\phi\,d\phi = n_\phi h \quad\text{and}\quad \oint_{\rm orbit} p_r\,dr = n_r h . \tag{5.28}$$

In Sommerfeld’s picture, the effect of relativity is to relieve the degeneracy of the orbits 
because the total energies of the orbits are now slightly different. A simple interpretation of 
the effect is that the elliptical orbits have different energies because the electrons acquire 
high speeds in close encounters with the nucleus and so the relativistic corrections differ 
from orbit to orbit. 

Sommerfeld’s analysis in his book Atombau und Spektrallinien (Sommerfeld, 1919) is a 
brilliant and clear exposition of the energy changes of the orbits of the electrons when the 
effects of special relativity are taken into account. The analysis is somewhat lengthy and 
so we summarise the key results of the analysis. The first realisation is that the ellipses are 
no longer stationary in space, but the perihelion of the electron’s orbit, the point of closest 
approach to the nucleus, precesses about the nucleus. This is illustrated by the diagram 
from Sommerfeld’s book (Fig. 5.3).

Fig. 5.3 Illustrating the precession of the elliptical orbits when the effects of special relativity are taken into account. This diagram appears as Fig. 110 in Sommerfeld’s book Atombau und Spektrallinien (Sommerfeld, 1919).

In polar coordinates, Sommerfeld writes the expression for the precessing elliptical orbit as

$$\frac{1}{r} = C_1 + C_2 \cos\gamma\phi , \tag{5.29}$$

which differs from the case of the stationary ellipse (5.1) by the inclusion of the factor $\gamma$ in the cosine term. The quantity $\gamma$ is defined to be

$$\gamma^2 = 1 - \frac{p_0^2}{p^2} , \tag{5.30}$$

where $p$ is now the relativistic three-momentum and is quantised according to the rule $\oint p\,d\phi = n_\phi h$. $p_0$ is defined to be the quantity $p_0 = Ze^2/4\pi\varepsilon_0 c$. In the case of circular Bohr orbits, it is straightforward to show that $\gamma = (1 - v^2/c^2)^{1/2}$, in other words, the inverse of my normal convention for the Lorentz factor $\gamma = (1 - v^2/c^2)^{-1/2}$. Just for this section, we will use $\gamma$ in Sommerfeld’s sense. Inspection of (5.29) shows that, since $\gamma < 1$, the orbit does not close up after $\phi = 2\pi$ radians, but after $\gamma\phi = 2\pi$ radians. This slight change per orbit is illustrated by the angle $\Delta\phi$ in Fig. 5.3. The orbits return to the standard elliptical form however if we introduce the coordinate $\psi = \gamma\phi$. Sommerfeld shows that the equation for the ellipse now becomes

$$\frac{1}{r} = \frac{1}{a}\,\frac{1 + \epsilon\cos\gamma\phi}{1 - \epsilon^2} , \tag{5.31}$$

while the momenta corresponding to $r$, $\phi$ are

$$p_\phi = m r^2 \dot\phi , \quad p_r = m\dot r , \tag{5.32}$$

where the momenta are relativistic three-momenta, that is, in (5.32) $m = m_e (1 - v^2/c^2)^{-1/2}$. The one important difference is that the quantisation condition in the radial direction now corresponds to the integration over a single ellipse in the $\psi$ coordinate, that is,

$$\int_{\phi=0}^{2\pi} p_\phi\,d\phi = n_\phi h \quad\text{and}\quad \int_{\psi=0}^{\psi=2\pi} p_r\,dr = n_r h . \tag{5.33}$$
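The closure condition described above fixes the size of the precession per orbit. The following short worked step is our own, not taken from Sommerfeld's text. It assumes (5.30) in the form $\gamma^2 = 1 - p_0^2/p^2$, the quantisation $p = n_\phi h/2\pi$, the definition $p_0 = Ze^2/4\pi\varepsilon_0 c$, and the fine-structure constant $\alpha = e^2/2\varepsilon_0 hc$ that is introduced later in this section.

```latex
\begin{align}
\Delta\phi &= \frac{2\pi}{\gamma} - 2\pi
            = 2\pi\left[\left(1 - \frac{p_0^2}{p^2}\right)^{-1/2} - 1\right]
            \approx \pi\,\frac{p_0^2}{p^2} , \\
\frac{p_0}{p} &= \frac{Ze^2/4\pi\varepsilon_0 c}{n_\phi h/2\pi}
               = \frac{Ze^2}{2\varepsilon_0 hc\,n_\phi}
               = \frac{\alpha Z}{n_\phi} , \\
\Delta\phi &\approx \frac{\pi\alpha^2 Z^2}{n_\phi^2} .
\end{align}
```

To lowest order the advance of the perihelion per orbit is therefore of order $\alpha^2$, which is why the associated energy differences appear as a fine structure.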
Carrying out the same procedure as in the non-relativistic case, Sommerfeld found that the ellipticities of the orbits are given by

$$1 - \epsilon^2 = \frac{n_\phi^2 - \alpha^2 Z^2}{\left[n_r + \sqrt{n_\phi^2 - \alpha^2 Z^2}\right]^2} , \tag{5.34}$$

where $\alpha = e^2/2\varepsilon_0 hc$ is the fine-structure constant, for reasons which will be apparent in a moment. It is already clear that the degeneracy of the energy levels has been relieved because the ellipticities of the orbits now depend differently upon $n_\phi$ and $n_r$ as compared with (5.14). The result (5.14) is recovered in the non-relativistic limit, which is obtained by setting the fine-structure constant $\alpha$ equal to zero.
Next, the energies of the elliptical orbits are evaluated. In the two-dimensional case with quantum numbers $n_\phi$ and $n_r$, the result is

$$E_{\rm tot} = m_e c^2 \left\{1 + \frac{\alpha^2 Z^2}{\left[n_r + \sqrt{n_\phi^2 - \alpha^2 Z^2}\right]^2}\right\}^{-1/2} - m_e c^2 . \tag{5.35}$$

The value of the fine-structure constant $\alpha$ is 1/137.036 and so (5.35) can be expanded to fourth order in $\alpha$ to find the relativistic expression for the energy levels of hydrogen-like atoms. Performing this calculation,

$$E_{\rm tot} = -\frac{Z^2 e^4 m_e}{8\varepsilon_0^2 h^2}\,\frac{1}{(n_\phi + n_r)^2}\left\{1 + \frac{\alpha^2 Z^2}{(n_\phi + n_r)^2}\left[\frac{n_r}{n_\phi} + \frac{1}{4}\right]\right\} . \tag{5.36}$$

This expression has a number of remarkable features. First, in the non-relativistic limit $c \to \infty$, $\alpha \to 0$, the second term in curly brackets disappears and we recover (5.23) for the energies of the elliptical orbits. Secondly, the second term in curly brackets is entirely associated with the effects of special relativity. The term associated with the factor 1/4 in this term corresponds to a shift in energy of the energy level labelled $(n_\phi + n_r)$ as a whole and does not result in any splitting of the energy level. In contrast, the term associated with the factor $n_r/n_\phi$ results in splitting of the energy levels and Sommerfeld identified this as the cause of the splitting of the lines of the Balmer series. This is why the quantity $\alpha = e^2/2\varepsilon_0 hc = 1/137.036$ is referred to as the fine-structure constant.

For hydrogen, the effect is most pronounced in the optical waveband for the H$\alpha$ line, in particular, with the splitting of the lower energy state with $(n_\phi + n_r) = 2$. The splitting of this level corresponds to the difference between the energies of the orbits with $(n_\phi = 2, n_r = 0)$ and $(n_\phi = 1, n_r = 1)$. Inserting these values into (5.36), the difference in energy corresponds to a frequency shift

$$\Delta\nu = \frac{Z^4 e^4 m_e}{8\varepsilon_0^2 h^3}\,\frac{\alpha^2}{16} . \tag{5.37}$$

This was found to be in good agreement with Michelson’s measurements of fine structure in the H$\alpha$ and H$\beta$ lines of hydrogen. Even better agreement was found from Paschen’s measurement of the splitting of the Balmer line of ionised helium He$^+$ which has the same form of fine structure but the effect is 16 times greater because of the $Z^4$ factor in the expression for the fine-structure displacement of the energy levels. Referring Paschen’s result to the expected splitting for the hydrogen Balmer lines, the measured value was $\Delta\nu_{\rm H} = 0.3645 \pm 0.0045$ cm$^{-1}$.
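The predicted splitting can be put into numbers for comparison with the measured value. The sketch below is our own: it evaluates the $n = 2$ splitting both from the exact expression (5.35) and from the expansion (5.36), working in wavenumbers $E/hc$. It assumes the value of the Rydberg constant $R_\infty$ in cm$^{-1}$ (not quoted in the text) and uses the identity $m_e c/h = 2R_\infty/\alpha^2$, which follows from the definitions of $\alpha$ and $R_\infty$. The reduced-mass correction is ignored.

```python
import math

ALPHA = 1.0 / 137.036        # value of the fine-structure constant quoted in the text
R_INF_CM = 109737.31568      # Rydberg constant in cm^-1 (assumed value, not from the text)

def level_exact(n_phi, n_r, Z=1):
    """E/(hc) in cm^-1 from (5.35), using m_e c / h = 2 R_inf / alpha^2."""
    a2 = (ALPHA * Z) ** 2
    denom = (n_r + math.sqrt(n_phi ** 2 - a2)) ** 2
    return (2.0 * R_INF_CM / ALPHA ** 2) * ((1.0 + a2 / denom) ** -0.5 - 1.0)

def level_expanded(n_phi, n_r, Z=1):
    """E/(hc) in cm^-1 from the expansion (5.36)."""
    n = n_phi + n_r
    a2 = (ALPHA * Z) ** 2
    return -(R_INF_CM * Z ** 2 / n ** 2) * (1.0 + a2 / n ** 2 * (n_r / n_phi + 0.25))

def splitting_n2(level, Z=1):
    """Separation of the (2, 0) and (1, 1) orbits with n_phi + n_r = 2, in cm^-1."""
    return level(2, 0, Z) - level(1, 1, Z)

if __name__ == "__main__":
    print("hydrogen, expanded:", splitting_n2(level_expanded))
    print("hydrogen, exact:   ", splitting_n2(level_exact))
    print("He+ / H ratio:     ", splitting_n2(level_expanded, 2) / splitting_n2(level_expanded, 1))
```

With these inputs the hydrogen splitting comes out close to 0.365 cm$^{-1}$, and the ratio for $Z = 2$ is 16, as stated in the text.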

Sommerfeld next applied the theory of fine-structure splitting to the splitting of the K and L X-ray fluorescence lines into (K$\alpha$, K$\beta$) and (L$\alpha$, L$\beta$), discussed in Sect. 4.6 and seen in Fig. 4.5. In the case of the K lines, Moseley’s formula (4.31) showed that the effective nuclear charge was $(Z - 1)$ while (4.32) showed it was $(Z - b)$ for L lines. Sommerfeld carried out his own analysis of the L lines and found $b = 3.5$. Using the ‘shielded’ charges in the expressions for the splitting of the L X-ray energy levels, the splittings for the elements shown in Fig. 4.5 were expected to be vastly greater than that of hydrogen because of the roughly $Z^4$ dependence upon the effective nuclear charge. Extrapolating from the L$\alpha$ and L$\beta$ X-ray lines of the heavy elements back to the case of hydrogen, Sommerfeld again found $\Delta\nu_{\rm H} = 0.365$ cm$^{-1}$.

Sommerfeld was rightly triumphant. He fully appreciated the fact that the success of 
the theory not only provided evidence for the elliptical orbits of electrons in atoms, but 
also provided a demonstration of the correctness of the special relativistic formula for the 
momentum of the electron. In his words: 


‘Thus, the observation of the fine-structure discloses the whole mechanism of the intra- 
atomic motions as far as the motion of the perihelion of the elliptic orbits. The complex of 
facts contained in the fine-structures has just the same importance for the special theory 
of relativity and for the atomic structure, as the motion of Mercury’s perihelion for the 
general theory of relativity.’ 


These results provided the basic elements for the old quantum theory which was to be greatly elaborated by Bohr and his associates. Sommerfeld took the generalised quantum conditions

$$\oint p_k\,dq_k = n_k h \tag{5.38}$$

to be the ultimate foundations of quantum theory, statements which were ‘unproved and perhaps incapable of being proved’. He also realised that much more powerful mathematical tools were already available for carrying out calculations in the quantum theory. In fact, these considerations were to lead to the natural mathematical language of quantum mechanics.


5.4 A mathematical interlude — from Newton to Hamilton-Jacobi 


It is worthwhile reviewing how the methods of higher mechanics provided the route to the 
analysis of problems in quantum theory. As Sommerfeld remarked, 


‘It is truly a royal route for quantum problems.’ 


The history of the development of more and more powerful ways of expressing the 
content of Newton’s laws of motion can be appreciated from the historical sequence of 
approaches: 


• Newton’s laws of motion,
• D’Alembert’s principle,
• Hamilton’s principle,
• The principle of least action,
• Generalised coordinates and Lagrange’s equations,
• The canonical equations of Hamilton,
• Transformation theory of mechanics and the Hamilton-Jacobi equations,
• Action-angle variables.

Very often, various more advanced approaches provide much more straightforward routes
to the solutions to problems and provide a deeper appreciation of the basic features of 
dynamical systems, including conservation laws and normal modes of oscillation of a 
mechanical or dynamical system. 


5.4.1 Newton’s laws of motion and principles of ‘least action’ 


Some of the most powerful approaches involve finding that function which minimises 
the value of some quantity subject to well-defined boundary conditions. In the formal 
development of mechanics, the procedures are stated axiomatically. Let us take as an 
example what Feynman refers to as the principle of minimum action. Consider the case of 
the dynamics of a particle in a conservative field of force, that is, one which is derived as 
the gradient of a scalar potential, F = —grad V. 

We introduce a set of axioms which enables us to work out the path of the particle subject 
to this force field. First, we define the quantity 


$$\mathcal{L} = T - V = \tfrac{1}{2} m v^2 - V , \tag{5.39}$$

the difference between the kinetic energy $T$ and the potential energy $V$ of the particle in the field. To derive the trajectory of the particle between two fixed endpoints in the field in a fixed time interval $t_1$ to $t_2$, we find that path which minimises the function

$$S = \int_{t_1}^{t_2} \left(\tfrac{1}{2} m v^2 - V\right) dt = \int_{t_1}^{t_2} \left[\tfrac{1}{2} m \left(\frac{d\mathbf{r}}{dt}\right)^2 - V\right] dt . \tag{5.40}$$

These statements are to be regarded as equivalent to Newton’s laws of motion or, rather, what Newton accurately called his ‘axioms’. $\mathcal{L}$ is called the Lagrangian.

These axioms are consistent with Newton’s first law of motion. If there are no forces present, $V$ = constant and hence we minimise $S = \int_{t_1}^{t_2} v^2\,dt$. The minimum value of $S$ must correspond to a constant velocity $v$ between $t_1$ and $t_2$. If the particle travelled between the endpoints by accelerating and decelerating in such a way that $t_1$ and $t_2$ are the same, the integral must be greater than that for constant velocity between $t_1$ and $t_2$ because $v^2$ appears in the integral and a basic rule of analysis tells us that $\langle v^2 \rangle \geq \langle v \rangle^2$. Thus, in the absence of forces, $v$ = constant, Newton’s first law of motion.

To proceed further, we need the techniques of the calculus of variations. Suppose $f(x)$ is a function of a single variable $x$. The minimum value of the function corresponds to the value of $x$ at which $df(x)/dx$ is zero. Approximating the variation of the function about the minimum at $x = 0$ by a power series,

$$f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots , \tag{5.41}$$

the function can only have $df/dx = 0$ at $x = 0$ if $a_1 = 0$. At this point, the function is approximately a parabola since the first non-zero coefficient in the expansion of $f(x)$ about the minimum is the term in $x^2$; for small displacements $x$ from the minimum, the change in the function $f(x)$ is second order in $x$.
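The same principle is used to find the path of the particle when $S$ is minimised. If the true path of the particle is $\mathbf{x}_0(t)$, another path between $t_1$ and $t_2$ is given by

$$\mathbf{x}(t) = \mathbf{x}_0(t) + \boldsymbol{\eta}(t) , \tag{5.42}$$

where $\boldsymbol{\eta}(t)$ describes the deviation of $\mathbf{x}(t)$ from the minimum path $\mathbf{x}_0(t)$. Just as we can define the minimum of a function as the point at which there is no first-order dependence of the function upon $x$, so we can define the minimum path as that function $\mathbf{x}_0(t)$ for which the dependence of $S$ on $\boldsymbol{\eta}(t)$ is at least second order, that is, there should be no linear term in $\boldsymbol{\eta}(t)$.

Substituting (5.42) into (5.40),

$$S = \int_{t_1}^{t_2} \left[\frac{m}{2}\left(\frac{d\mathbf{x}_0}{dt} + \frac{d\boldsymbol{\eta}}{dt}\right)^2 - V(\mathbf{x}_0 + \boldsymbol{\eta})\right] dt = \int_{t_1}^{t_2} \left\{\frac{m}{2}\left[\left(\frac{d\mathbf{x}_0}{dt}\right)^2 + 2\,\frac{d\mathbf{x}_0}{dt}\cdot\frac{d\boldsymbol{\eta}}{dt} + \left(\frac{d\boldsymbol{\eta}}{dt}\right)^2\right] - V(\mathbf{x}_0 + \boldsymbol{\eta})\right\} dt . \tag{5.43}$$

We are interested only in quantities which are first order in $\boldsymbol{\eta}$ and hence we can drop the second-order term $(d\boldsymbol{\eta}/dt)^2$. We expand $V(\mathbf{x}_0 + \boldsymbol{\eta})$ to first order in $\boldsymbol{\eta}$ by a Taylor expansion,

$$V(\mathbf{x}_0 + \boldsymbol{\eta}) = V(\mathbf{x}_0) + \nabla V \cdot \boldsymbol{\eta} . \tag{5.44}$$

Substituting (5.44) into (5.43) and preserving only quantities to first order in $\boldsymbol{\eta}$, we find

$$S = \int_{t_1}^{t_2} \left[\frac{m}{2}\left(\frac{d\mathbf{x}_0}{dt}\right)^2 - V(\mathbf{x}_0) + m\,\frac{d\mathbf{x}_0}{dt}\cdot\frac{d\boldsymbol{\eta}}{dt} - \nabla V \cdot \boldsymbol{\eta}\right] dt . \tag{5.45}$$

Now the first two terms inside the integral give the value of $S$ along the minimum path and so are a constant. We therefore need to ensure that the last two terms have no dependence upon $\boldsymbol{\eta}$, the condition for a minimum. We therefore need consider only the last two terms, which give the first-order change $\delta S$ in $S$,

$$\delta S = \int_{t_1}^{t_2} \left(m\,\frac{d\mathbf{x}_0}{dt}\cdot\frac{d\boldsymbol{\eta}}{dt} - \boldsymbol{\eta}\cdot\nabla V\right) dt . \tag{5.46}$$

We integrate the first term by parts so that $\boldsymbol{\eta}$ alone appears in the integrand, that is,

$$\delta S = \left[m\,\boldsymbol{\eta}\cdot\frac{d\mathbf{x}_0}{dt}\right]_{t_1}^{t_2} - \int_{t_1}^{t_2} \left[\frac{d}{dt}\left(m\,\frac{d\mathbf{x}_0}{dt}\right)\cdot\boldsymbol{\eta} + \boldsymbol{\eta}\cdot\nabla V\right] dt . \tag{5.47}$$

The function $\boldsymbol{\eta}$ must be zero at $t_1$ and $t_2$ because that is where the path begins and ends and the endpoints are fixed. Therefore, the first term on the right-hand side of (5.47) is zero and we can write

$$\delta S = -\int_{t_1}^{t_2} \boldsymbol{\eta}\cdot\left[\frac{d}{dt}\left(m\,\frac{d\mathbf{x}_0}{dt}\right) + \nabla V\right] dt = 0 . \tag{5.48}$$

This must be true for arbitrary perturbations about $\mathbf{x}_0(t)$ and hence the term in square brackets must be zero, that is,

$$\frac{d}{dt}\left(m\,\frac{d\mathbf{x}_0}{dt}\right) = -\nabla V . \tag{5.49}$$

We have recovered Newton’s second law of motion since $\mathbf{F} = -\nabla V$, that is,

$$\mathbf{F} = \frac{d}{dt}\left(m\,\frac{d\mathbf{x}_0}{dt}\right) = \frac{d\mathbf{p}}{dt} . \tag{5.50}$$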


Thus, our alternative formulation of the laws of motion in terms of an action principle is 
exactly equivalent to Newton’s statement of the laws of motion. 
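The statement that the true path makes $S$ stationary, with only second-order changes for neighbouring paths, can be illustrated numerically. The sketch below is our own example, not one from the text: a particle in one dimension in a uniform field, $V = mgx$, travels between $x = 0$ at $t = 0$ and $x = 0$ at $t = T$. The Newtonian path is $x_0(t) = \tfrac{1}{2} g t (T - t)$, and we compare its action with that of perturbed paths $x_0 + A\sin(\pi t/T)$, which share the same endpoints. The values of $m$, $g$, $T$ and $A$ are arbitrary choices.

```python
import math

G = 9.81       # field strength, m s^-2 (arbitrary choice for the illustration)
T_END = 1.0    # duration of the motion, s (arbitrary choice)

def action(path, t1=0.0, t2=T_END, m=1.0, g=G, n=20000):
    """Midpoint-rule estimate of S = integral of (m v^2 / 2 - m g x) dt along path(t)."""
    dt = (t2 - t1) / n
    total = 0.0
    for i in range(n):
        ta = t1 + i * dt
        xa, xb = path(ta), path(ta + dt)
        v = (xb - xa) / dt            # velocity on this small interval
        x_mid = 0.5 * (xa + xb)       # position at the middle of the interval
        total += (0.5 * m * v * v - m * g * x_mid) * dt
    return total

def true_path(t):
    """Solution of Newton's second law with x(0) = x(T_END) = 0."""
    return 0.5 * G * t * (T_END - t)

def perturbed_path(amplitude):
    """True path plus a deviation eta(t) that vanishes at both endpoints."""
    return lambda t: true_path(t) + amplitude * math.sin(math.pi * t / T_END)

if __name__ == "__main__":
    s0 = action(true_path)
    for amplitude in (0.05, 0.1, 0.2):
        excess = action(perturbed_path(amplitude)) - s0
        # For this potential the excess is exactly m A^2 pi^2 / (4 T): second order in A.
        print(amplitude, excess, amplitude ** 2 * math.pi ** 2 / (4.0 * T_END))
```

Doubling the amplitude of the deviation quadruples the excess action, so there is no term linear in $\eta$, as required by (5.48).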

These procedures can be generalised to take account of conservative and non-conservative forces such as those which depend upon velocity, for example, friction and the force on a charged particle in a magnetic field (Goldstein, 1950). The key point is that we have a prescription which involves writing down the kinetic and potential energies ($T$ and $V$ respectively) of the system and then forming the Lagrangian $\mathcal{L}$ and finding the minimum value of the function $S$. The advantage of this procedure is that it is often a straightforward matter to write down these energies in some suitable set of coordinates. We therefore need rules which tell us how to find the minimum value of $S$ in any set of coordinates which is convenient for the problem in hand. These are the Euler-Lagrange equations.


5.4.2 The Euler-Lagrange equations 


Let us consider a system of $N$ particles interacting through a scalar potential function $V$. The positions of the $N$ particles are given by the vectors $[\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3, \ldots, \mathbf{r}_N]$ in Cartesian coordinates. Since three numbers are needed to describe the position of each particle, for example, $x_i, y_i, z_i$, the vector describing the positions of all the particles has $3N$ coordinates. For greater generality, we wish to transform these coordinates into a different set of coordinates which we write as $[q_1, q_2, q_3, \ldots, q_{3N}]$. The set of relations between these coordinates can be written

$$q_i = q_i(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3, \ldots, \mathbf{r}_N) , \quad\text{and hence}\quad \mathbf{r}_i = \mathbf{r}_i(q_1, q_2, q_3, \ldots, q_{3N}) . \tag{5.51}$$


This amounts to no more than a change of variables. 

The aim of the procedure is to write down the equations for the dynamics of the particles, that is, an equation for each independent coordinate, in terms of the coordinates $q_i$ rather than $\mathbf{r}_i$. We are guided by the analysis of the previous subsection on action principles to form the quantities $T$ and $V$, the kinetic and potential energies respectively, in terms of the new set of coordinates and then to find the stationary value of $S$, that is

$$\mathcal{L} = T - V , \quad \delta S = \delta \int_{t_1}^{t_2} (T - V)\,dt = 0 , \tag{5.52}$$

where $\delta$ means ‘take the variation about a particular value of the coordinates’ as discussed in the previous subsection. This formulation is called Hamilton’s principle and as before $\mathcal{L}$ is the Lagrangian of the system. Notice that Hamilton’s principle makes no reference to the coordinate system to be used in the calculation.

The kinetic energy of the system is

$$T = \sum_i \tfrac{1}{2} m_i \dot{\mathbf{r}}_i^2 . \tag{5.53}$$
In terms of our new coordinate system, we can write without loss of generality

$$\mathbf{r}_i = \mathbf{r}_i(q_1, q_2, \ldots, q_{3N}, t) \quad\text{and}\quad \dot{\mathbf{r}}_i = \dot{\mathbf{r}}_i(\dot q_1, \dot q_2, \ldots, \dot q_{3N}, q_1, q_2, \ldots, q_{3N}, t) . \tag{5.54}$$

Notice that we have now included explicitly the time dependence of $\mathbf{r}_i$ and $\dot{\mathbf{r}}_i$. Therefore, we can write the kinetic energy as a function of the coordinates $\dot q_i$, $q_i$ and $t$, that is, $T(\dot q_i, q_i, t)$, where we understand that all the values of $i$ from 1 to $3N$ are included. Similarly, we can write the expression for the potential energy entirely in terms of the coordinates $q_i$ and $t$, that is, $V(q_i, t)$. We therefore need to find the stationary values of

$$S = \int_{t_1}^{t_2} \left[T(\dot q_i, q_i, t) - V(q_i, t)\right] dt = \int_{t_1}^{t_2} \mathcal{L}(\dot q_i, q_i, t)\,dt . \tag{5.55}$$


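We repeat the analysis of Sect. 5.4.1 in which we found the condition for $S$ to be independent of first-order perturbations about the minimum path. As before, we let $q_0(t)$ be the minimum solution and write the expression for another function $q(t)$ in the form

$$q(t) = q_0(t) + \eta(t) . \tag{5.56}$$

Now we insert the trial solution (5.56) into (5.55),

$$S = \int_{t_1}^{t_2} \mathcal{L}\left[\dot q_0(t) + \dot\eta(t), q_0(t) + \eta(t), t\right] dt .$$

Performing a Taylor expansion to first order in $\dot\eta(t)$ and $\eta(t)$,

$$S = \int_{t_1}^{t_2} \mathcal{L}\left[\dot q_0(t), q_0(t), t\right] dt + \int_{t_1}^{t_2} \left[\frac{\partial\mathcal{L}}{\partial \dot q_i}\,\dot\eta(t) + \frac{\partial\mathcal{L}}{\partial q_i}\,\eta(t)\right] dt . \tag{5.57}$$

Setting the first integral equal to $S_0$ and integrating the term in $\dot\eta(t)$ by parts,

$$S = S_0 + \left[\eta(t)\,\frac{\partial\mathcal{L}}{\partial \dot q_i}\right]_{t_1}^{t_2} - \int_{t_1}^{t_2} \left[\frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial \dot q_i}\right) - \frac{\partial\mathcal{L}}{\partial q_i}\right] \eta(t)\,dt . \tag{5.58}$$

Again because $\eta(t)$ must always be zero at the endpoints, the first term in square brackets disappears and the result can be written

$$S = S_0 - \int_{t_1}^{t_2} \left[\frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial \dot q_i}\right) - \frac{\partial\mathcal{L}}{\partial q_i}\right] \eta(t)\,dt .$$

We require this integral to be zero for all first-order perturbations about the minimum solution. Therefore, the condition is

$$\frac{\partial\mathcal{L}}{\partial q_i} - \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial \dot q_i}\right) = 0 . \tag{5.59}$$

The equations (5.59) represent $3N$ second-order differential equations for the time evolution of the $3N$ coordinates and are known as the Euler-Lagrange equations. They are no more than Newton’s laws of motion written in the $q_i$ coordinate system which can be chosen to be the most convenient for the particular problem at hand.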
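As a worked illustration of (5.59), consider the electron in the Coulomb field of Sect. 5.2, described in the polar coordinates $(r, \phi)$ with the kinetic energy (5.5). This is our own sketch, assuming a fixed nucleus of charge $Ze$ and the potential energy used in (5.18).

```latex
\begin{align}
\mathcal{L} &= \tfrac{1}{2} m_e \left(\dot r^2 + r^2 \dot\phi^2\right)
              + \frac{Ze^2}{4\pi\varepsilon_0 r} , \\
\phi:\quad & \frac{\partial\mathcal{L}}{\partial\phi} = 0
  \;\Rightarrow\; \frac{d}{dt}\left(m_e r^2 \dot\phi\right) = 0 , \\
r:\quad & \frac{\partial\mathcal{L}}{\partial r}
  = m_e r \dot\phi^2 - \frac{Ze^2}{4\pi\varepsilon_0 r^2}
  \;\Rightarrow\; m_e \ddot r = m_e r \dot\phi^2 - \frac{Ze^2}{4\pi\varepsilon_0 r^2} .
\end{align}
```

The first equation states that $m_e r^2 \dot\phi$, the azimuthal momentum $p_\phi$ of (5.6), is a constant of the motion. The second is the radial equation of motion, in which the centrifugal term appears without any separate discussion of forces in a rotating frame.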
5.4.3 Hamilton’s equations


Another way of developing the equations of mechanics and dynamics is to convert (5.59) into a set of $6N$ first-order differential equations by introducing the generalised momenta $p_i$. There was no mention of the components of the momenta of the particles $p_i$ in the last subsection but these can be introduced by taking the derivative of the Lagrangian (5.52) in Cartesian coordinates with respect to the component of velocity $\dot x_i$, and we find

$$\frac{\partial\mathcal{L}}{\partial \dot x_i} = \frac{\partial T}{\partial \dot x_i} = \frac{\partial}{\partial \dot x_i} \sum_j \tfrac{1}{2} m_j \left(\dot x_j^2 + \dot y_j^2 + \dot z_j^2\right) = m_i \dot x_i = p_i . \tag{5.60}$$

This expression indicates how the concept of momentum can be generalised within the framework of the Lagrangian approach to mechanics. The quantity $\partial\mathcal{L}/\partial \dot q_i$ is defined to be $p_i$, the canonical momentum, conjugate to the coordinate $q_i$, by

$$p_i = \frac{\partial\mathcal{L}}{\partial \dot q_i} . \tag{5.61}$$

The $p_i$ defined by (5.61) do not necessarily have the dimensions of linear velocity times mass if $q_i$ is not in Cartesian coordinates. Furthermore, if the potential is velocity dependent, even in Cartesian coordinates, the generalised momentum is not identical with the usual definition of mechanical momentum. An example of this is the motion of a charged particle in an electromagnetic field in Cartesian coordinates for which the Lagrangian is

$$\mathcal{L} = \sum_i \tfrac{1}{2} m_i \dot{\mathbf{r}}_i^2 - \sum_i e_i \phi(\mathbf{x}_i) + \sum_i e_i \mathbf{A}(\mathbf{x}_i)\cdot\dot{\mathbf{r}}_i , \tag{5.62}$$

where $e_i$ is the charge of the particle $i$, $\phi(\mathbf{x}_i)$ is the electrostatic potential at the particle $i$ and $\mathbf{A}(\mathbf{x}_i)$ the vector potential at the same position. Then, the $x$-component of the canonical momentum of the $i$th particle is

$$p_{ix} = \frac{\partial\mathcal{L}}{\partial \dot x_i} = m_i \dot x_i + e_i A_x , \tag{5.63}$$

and similarly for the $y$- and $z$-coordinates.

With these definitions, a number of conservation laws can be derived.* For example, if the Lagrangian does not depend upon the coordinate $q_i$, (5.59) immediately shows that the generalised momentum is conserved, $dp_i/dt = 0$, $p_i$ = constant. Another standard result is that the total energy of the system in a conservative field of force with no time-dependent constraints is given by the Hamiltonian $H$ which can be written

$$H = \sum_i p_i \dot q_i - \mathcal{L}(q, \dot q) . \tag{5.64}$$

It looks as though $H$ depends upon $p_i$, $\dot q_i$ and $q_i$ but, in fact, we can rearrange the equation to show that $H$ is a function of only $p_i$ and $q_i$. Let us take the total differential of $H$ in the usual way, assuming $\mathcal{L}$ is time independent. Then

$$dH = \sum_i p_i\,d\dot q_i + \sum_i \dot q_i\,dp_i - \sum_i \frac{\partial\mathcal{L}}{\partial \dot q_i}\,d\dot q_i - \sum_i \frac{\partial\mathcal{L}}{\partial q_i}\,dq_i . \tag{5.65}$$

Since $p_i = \partial\mathcal{L}/\partial \dot q_i$, the first and third terms on the right-hand side cancel. Therefore,

$$dH = \sum_i \dot q_i\,dp_i - \sum_i \frac{\partial\mathcal{L}}{\partial q_i}\,dq_i . \tag{5.66}$$

This differential depends only on the increments $dp_i$ and $dq_i$ and hence we can compare $dH$ with its formal expansion in terms of $p_i$ and $q_i$:

$$dH = \sum_i \frac{\partial H}{\partial p_i}\,dp_i + \sum_i \frac{\partial H}{\partial q_i}\,dq_i .$$

It follows immediately that

$$\frac{\partial H}{\partial q_i} = -\frac{\partial\mathcal{L}}{\partial q_i} ; \quad \frac{\partial H}{\partial p_i} = \dot q_i . \tag{5.67}$$

Since

$$\frac{\partial\mathcal{L}}{\partial q_i} = \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial \dot q_i}\right) = \dot p_i , \tag{5.68}$$

we find from the Euler-Lagrange equation,

$$\frac{\partial H}{\partial q_i} = -\dot p_i .$$

We thus reduce the equations of motion to the pair of relations

$$\dot q_i = \frac{\partial H}{\partial p_i} ; \quad \dot p_i = -\frac{\partial H}{\partial q_i} . \tag{5.69}$$

This pair of equations is known as Hamilton’s equations. They are first-order differential equations for each of the $3N$ coordinates. We are now treating the $p_i$s and the $q_i$s on the same footing. If $V$ is independent of $\dot q$, $H$ is just the total energy $T + V$ expressed in terms of the coordinates $p_i$ and $q_i$.
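Because (5.69) are first-order equations in which $q$ and $p$ appear on the same footing, they are convenient for step-by-step numerical integration. The sketch below is our own illustration, not an example from the text: it integrates Hamilton's equations for a one-dimensional harmonic oscillator, $H = p^2/2m + \tfrac{1}{2} k q^2$, using a leapfrog scheme, and checks that $H$ stays constant and that the motion returns to its starting point after one period. The choice of system, of integrator and of the parameter values are all assumptions made for the illustration.

```python
import math

def hamiltonian(q, p, m=1.0, k=1.0):
    """H = T + V for the harmonic oscillator, expressed in terms of p and q."""
    return p * p / (2.0 * m) + 0.5 * k * q * q

def integrate(q, p, dt, steps, m=1.0, k=1.0):
    """Leapfrog integration of dq/dt = dH/dp = p/m and dp/dt = -dH/dq = -k q."""
    for _ in range(steps):
        p -= 0.5 * dt * k * q     # half step in p using -dH/dq
        q += dt * p / m           # full step in q using dH/dp
        p -= 0.5 * dt * k * q     # second half step in p
    return q, p

if __name__ == "__main__":
    steps = 10000
    dt = 2.0 * math.pi / steps    # one period for m = k = 1
    q1, p1 = integrate(1.0, 0.0, dt, steps)
    print(q1, p1, hamiltonian(q1, p1))
```

The returned values should lie close to the initial conditions $q = 1$, $p = 0$, with $H$ close to its initial value of 0.5.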


5.4.4 The Hamilton—Jacobi equations and action—angle variables 


Why do we have to delve deeper into the principles of classical mechanics? Hamilton’s 
equations (5.69) are often difficult to solve but they can be simplified to tackle specific 
problems. It turned out that the appropriate tools were available for application to the 
quantum theory through Carl Jacobi’s extension of Hamilton’s equations. In his classic 
textbook Classical Mechanics, Goldstein (1950) remarks: 


‘For a long time action—angle variables remained an esoteric technique of classical me- 
chanics used only by astronomers. The situation changed rapidly with the advent of Bohr’s 
quantum theory of the atom, for it was found that the quantum conditions could be stated 
most simply in terms of action variables. In classical mechanics the action variables pos- 
sess a continuous range of values, but this is no longer the case in quantum mechanics. 
The quantum conditions of Sommerfeld and Wilson required that the motion be limited 
to such orbits for which the “proper” action variables had discrete values which were 
integral multiples of h, the quantum of action.’ 
These ‘esoteric’ procedures had been used with considerable success in astronomical problems, for example, in Delaunay’s Théorie du mouvement de la lune (1860, 1867) and in Charlier’s Die Mechanik des Himmels (1902). It is no surprise that Karl Schwarzschild, theoretical astrophysicist and the Director of the Potsdam Observatory, was one of the principal contributors to the introduction of Hamilton-Jacobi theory into the solution of problems in quantum theory.

The mathematical procedures which result in the Hamilton-Jacobi equations and their 
application to astronomical and quantum problems are elegantly expounded by Goldstein 
(1950). We present here a simplified version of that story aimed at deriving the useful 
solutions as directly as possible. The essence of the approach is to transform from one 
set of coordinates to another, while preserving the Hamiltonian form of the equations of 
motion and ensuring that the new variables are also canonical coordinates. This approach 
is sometimes called the transformation theory of mechanics. Thus, suppose we wish to
transform from the set of coordinates $q_i$ to a new set $Q_i$. We need to ensure that the
independent momentum coordinates simultaneously transform into canonical coordinates
in the new system, that is, there should also be a set of coordinates $P_i$ corresponding to $p_i$.
Thus, the problem is to define the new coordinates $(Q_i, P_i)$ in terms of the old coordinates
$(q_i, p_i)$:

$$Q_i = Q_i(q, p, t), \qquad P_i = P_i(q, p, t). \tag{5.70}$$


We need to find the transformations which result in the $(Q_i, P_i)$ being canonical coordinates
such that

$$\dot{Q}_i = \frac{\partial K}{\partial P_i}, \qquad \dot{P}_i = -\frac{\partial K}{\partial Q_i}, \tag{5.71}$$

where $K$ is now the Hamiltonian in the $(Q_i, P_i)$ coordinate system. The transformations
which result in (5.71) are known as canonical transformations. They are also known as
contact transformations. In the old system, the coordinates satisfied Hamilton’s variational
principle,





$$\delta \int_{t_1}^{t_2} \left[ \sum_i p_i \dot{q}_i - H(q, p, t) \right] dt = 0, \tag{5.72}$$


and so the same rule must hold true in the new set of coordinates, 


$$\delta \int_{t_1}^{t_2} \left[ \sum_i P_i \dot{Q}_i - K(Q, P, t) \right] dt = 0. \tag{5.73}$$


The equations (5.72) and (5.73) must be valid simultaneously. The clever trick is to note 
that the integrands inside the integrals in (5.72) and (5.73) need not be equal, but can differ 
by the total time derivative of some arbitrary function S. This works because the integral of 
S between the fixed endpoints is a constant and so disappears when the variations in (5.72) 
and (5.73) are carried out, 


$$\delta \int_{t_1}^{t_2} \frac{dS}{dt}\, dt = \delta (S_2 - S_1) = 0. \tag{5.74}$$

The function $S$ is known as the generating function because once it is specified, the
transformation equations (5.70) are completely determined. It looks as though $S$ is a function
of the four coordinates $(q_i, p_i, Q_i, P_i)$, but in fact they are related by the functions (5.70)
and so $S$ can be defined by any two of $(q_i, p_i, Q_i, P_i)$. Goldstein (1950) and Lindsay and
Margenau (1957) show how the various transformations between the coordinate systems
can be written. For example, if we choose to write the transformations in terms of $q_i$, $Q_i$,
the transformation equations are

$$p_i = \frac{\partial S}{\partial q_i}, \qquad P_i = -\frac{\partial S}{\partial Q_i} \qquad \text{with} \qquad K = H + \frac{\partial S}{\partial t}. \tag{5.75}$$

Alternatively, if we choose $q_i$, $P_i$, the transformation equations become

$$p_i = \frac{\partial S}{\partial q_i}, \qquad Q_i = \frac{\partial S}{\partial P_i} \qquad \text{with} \qquad K = H + \frac{\partial S}{\partial t}. \tag{5.76}$$


The other two sets of transformations are included in the endnote.* 


Example Let us first work out the motion of a one-dimensional harmonic oscillator in this
new formalism. The mass of the oscillator is $m$ and the spring constant $k$. From these, we
can define the angular frequency $\omega$ by $\omega^2 = k/m$. The Hamiltonian is therefore

$$H = \frac{p^2}{2m} + \frac{kq^2}{2}. \tag{5.77}$$


Now introduce a generating function defined by

$$S = \frac{m\omega}{2}\, q^2 \cot Q, \tag{5.78}$$

which depends only upon $q$ and $Q$. We can therefore use (5.75) to find the transformations
between coordinate systems. There is no explicit dependence of $S$ upon time $t$ and so, from
the third equation of (5.75), $H = K$. We can now find the coordinates $p$ and $q$ in terms of
the new coordinates $P$ and $Q$. From the first and second equations of (5.75)

$$p = \frac{\partial S}{\partial q} = m\omega q \cot Q, \qquad P = -\frac{\partial S}{\partial Q} = \frac{m\omega}{2}\, q^2\, \mathrm{cosec}^2 Q. \tag{5.79}$$


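These can be reorganised to give expressions for $p$ and $q$ in terms of $P$ and $Q$:

$$q = \sqrt{\frac{2}{m\omega}}\, \sqrt{P} \sin Q, \qquad p = \sqrt{2m\omega}\, \sqrt{P} \cos Q. \tag{5.80}$$

Substituting these expressions for $p$ and $q$ into the Hamiltonian (5.77), we find

$$H = K = \frac{p^2}{2m} + \frac{kq^2}{2} = \omega P. \tag{5.81}$$

We now find the solutions for $P$ and $Q$ from Hamilton’s equations

$$\dot{P} = -\frac{\partial K}{\partial Q}, \qquad \dot{Q} = \frac{\partial K}{\partial P}, \tag{5.82}$$

and so, from (5.81), it follows that

$$\dot{P} = 0, \quad \dot{Q} = \omega \qquad \text{and so} \qquad P = \alpha, \quad Q = \omega t + \beta, \tag{5.83}$$

where $\alpha$ and $\beta$ are constants. These solutions can now be substituted into (5.80) and we
find

$$p = \sqrt{2m\omega\alpha}\, \cos(\omega t + \beta), \qquad q = \sqrt{\frac{2\alpha}{m\omega}}\, \sin(\omega t + \beta). \tag{5.84}$$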


Notice how economically we can determine the constants of the motion once we have found 
an appropriate generating function S. 
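
As a numerical sanity check of this example, the following Python sketch evaluates the transformation (5.80) at an arbitrary point and confirms that the Hamiltonian (5.77) reduces to $\omega P$, as in (5.81). It also estimates the Poisson bracket of $q$ and $p$ with respect to $(Q, P)$ by finite differences. A value of unity is the standard test that a transformation is canonical, a criterion which is not derived in the text. The function names, the values of $m$ and $\omega$ and the step size are illustrative assumptions.

```python
import math


def to_old(Q, P, m, omega):
    """Return (q, p) for given (Q, P) using the transformation (5.80)."""
    q = math.sqrt(2.0 * P / (m * omega)) * math.sin(Q)
    p = math.sqrt(2.0 * m * omega * P) * math.cos(Q)
    return q, p


def hamiltonian(q, p, m, omega):
    """Oscillator Hamiltonian (5.77), written with k = m * omega**2."""
    return p * p / (2.0 * m) + 0.5 * m * omega ** 2 * q * q


def poisson_bracket(Q, P, m, omega, h=1e-6):
    """Central-difference estimate of {q, p} taken with respect to (Q, P)."""
    q_a, p_a = to_old(Q + h, P, m, omega)
    q_b, p_b = to_old(Q - h, P, m, omega)
    q_c, p_c = to_old(Q, P + h, m, omega)
    q_d, p_d = to_old(Q, P - h, m, omega)
    dq_dQ, dp_dQ = (q_a - q_b) / (2 * h), (p_a - p_b) / (2 * h)
    dq_dP, dp_dP = (q_c - q_d) / (2 * h), (p_c - p_d) / (2 * h)
    return dq_dQ * dp_dP - dq_dP * dp_dQ


# Illustrative values in arbitrary units (assumptions, not taken from the text)
m, omega = 2.0, 3.0
Q, P = 0.7, 1.3

q, p = to_old(Q, P, m, omega)
energy = hamiltonian(q, p, m, omega)
bracket = poisson_bracket(Q, P, m, omega)

print("H(q, p) =", energy, " omega * P =", omega * P)
print("{q, p} =", bracket)
```

The two printed energies should agree to rounding error and the bracket should be close to unity, whatever values of $Q$ and $P > 0$ are chosen.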


The question is then, ‘Can we find the appropriate generating functions for the problem
at hand?’ For the cases in which we are interested, in which there is no explicit dependence
upon the time $t$ and in which the variables in the partial differential equations are separable,
the answer is ‘Yes’. We begin with the relation between the Hamiltonian $H$ and the total
energy of the system $E$. For the cases we are interested in,

$$H(q_i, p_i) = E. \tag{5.85}$$

We can now replace $p_i$ by its definition in terms of the partial differentials of the generating
function $S$, $p_i = \partial S/\partial q_i$, so that

$$H\left(q_i, \frac{\partial S}{\partial q_i}\right) = E. \tag{5.86}$$

Writing out the partial derivatives, this is a first-order partial differential equation in the
$n$ coordinates $q_i$ for the generating function $S$,

$$H\left(q_1, q_2, \ldots, q_n, \frac{\partial S}{\partial q_1}, \frac{\partial S}{\partial q_2}, \ldots, \frac{\partial S}{\partial q_n}\right) = E. \tag{5.87}$$

Equation (5.87) is known as the Hamilton-Jacobi equation in time-independent form. $S$ is
also referred to as Hamilton’s principal function.

We are interested in periodic solutions of the equations of motion for conservative
motion under a central field of force. There is no explicit dependence of the coordinate
transformations upon time and so the transformation equations (5.76) become

$$p_i = \frac{\partial S}{\partial q_i}, \qquad Q_i = \frac{\partial S}{\partial P_i} \qquad \text{with} \qquad K = H. \tag{5.88}$$

We now define a new coordinate

$$J_i = \oint p_i\, dq_i, \tag{5.89}$$

which is known as a phase integral, the integration of the momentum variable $p_i$ being
taken over a complete cycle of the values of $q_i$. In the case of Cartesian coordinates, it
can be seen that $J_i$ has the dimensions of angular momentum and is known as an action
variable, by analogy with Hamilton’s definition in his principle of least action. During one
cycle the values of $q_i$ vary continuously between finite values from $q_{\min}$ to $q_{\max}$ and back
again to $q_{\min}$. As a result, from the definition (5.89), the action $J_i$ is a constant. If $q_i$ does
not appear in the Hamiltonian, as occurs in the cases of interest to us, the variable is said to
be cyclic.

Accompanying $J_i$ there is a conjugate quantity $w_i$,

$$w_i = \frac{\partial S}{\partial J_i}. \tag{5.90}$$

The corresponding canonical equations are

$$\dot{w}_i = \frac{\partial K}{\partial J_i}, \qquad \dot{J}_i = -\frac{\partial K}{\partial w_i}, \tag{5.91}$$

where $K$ is the transformed Hamiltonian. $w_i$ is referred to as an angle variable and so the
motion of the particle is now described in terms of $(J_i, w_i)$ or action-angle variables. If
the transformed Hamiltonian $K$ is independent of $w_i$, it follows that

$$\dot{w}_i = \text{constant} = \nu_i, \qquad \dot{J}_i = 0, \tag{5.92}$$

where the $\nu_i$ are constants. Therefore,

$$w_i = \nu_i t + \gamma_i, \qquad J_i = \text{constant}. \tag{5.93}$$

The constants $\nu_i$ are associated with the frequency of motion. If we take the integral of
$w_i$ round a complete cycle of $q_k$, we find

$$\Delta w_i = \oint \frac{\partial w_i}{\partial q_k}\, dq_k = \oint \frac{\partial}{\partial q_k} \left( \frac{\partial S}{\partial J_i} \right) dq_k. \tag{5.94}$$

Taking the differentiation outside the integral,

$$\Delta w_i = \frac{\partial}{\partial J_i} \oint \frac{\partial S}{\partial q_k}\, dq_k = \frac{\partial J_k}{\partial J_i}, \tag{5.95}$$

the last equality resulting from the definition $p_k = \partial S/\partial q_k$. Thus, $\Delta w_i = 1$ if $i = k$ and
$\Delta w_i = 0$ if $i \neq k$. In other words, each independent variable $w_i$ changes from 0 to 1 when
the corresponding $q_i$ goes through a complete cycle.
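
The action variable (5.89) can be evaluated directly for the harmonic oscillator of the earlier example. The sketch below integrates $p\,dq$ round one cycle at fixed energy $E$ and compares the result with $E/\nu$, where $\nu = \omega/2\pi$. The function name, the substitution $q = q_{\max}\sin u$ and the numerical values are illustrative assumptions rather than anything taken from the text.

```python
import math


def action_integral(E, m, omega, n=2000):
    """Estimate J = closed integral of p dq at energy E by the midpoint rule."""
    q_max = math.sqrt(2.0 * E / (m * omega ** 2))  # turning point of the motion
    du = math.pi / n
    half_cycle = 0.0
    for i in range(n):
        # q = q_max * sin(u) keeps the integrand smooth at the turning points
        u = -0.5 * math.pi + (i + 0.5) * du
        q = q_max * math.sin(u)
        p_squared = 2.0 * m * (E - 0.5 * m * omega ** 2 * q * q)
        p = math.sqrt(max(p_squared, 0.0))
        half_cycle += p * q_max * math.cos(u) * du
    # The return leg (p < 0, dq < 0) contributes the same amount again
    return 2.0 * half_cycle


# Illustrative values in arbitrary units (assumptions, not taken from the text)
m, omega, E = 1.5, 2.0, 0.8
J = action_integral(E, m, omega)
nu = omega / (2.0 * math.pi)
print("J =", J, " E / nu =", E / nu)
```

Since the two numbers agree, $H = \nu J$ for the oscillator and so $\partial H/\partial J = \nu$, which is the content of (5.91) and (5.92) for this simple case.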

It is apparent from these considerations that action-angle variables are the ideal tools
for studying the orbits of electrons in the old quantum theory, Sommerfeld’s ‘royal route
for quantum problems’. The introduction of action-angle variables into quantum theory
was pioneered by Schwarzschild and Epstein in the contexts to be described in Sect. 7.2
(Schwarzschild, 1916; Epstein, 1916a). As remarked by Jammer (1989),


‘... it almost seemed as if Hamilton’s method had expressly been created for treating
quantum mechanical problems.’


Let us give the example of Sommerfeld’s treatment of the Bohr model in three dimensions. 


5.5 Sommerfeld’s model of the atom in three dimensions 


The best way of illustrating how these procedures provide a natural set of mathematical
tools for quantum problems is to repeat Sommerfeld’s analysis of the quantisation of angular
momentum in three dimensions.


Downloaded from Cambridge Books Online by IP 132.166.47.205 on Thu Jan 23 07:52:52 GMT 2014.
http://dx.doi.org/10.1017/CBO9781 139062060.006
Cambridge Books Online © Cambridge University Press, 2014


In spherical polar coordinates, the kinetic energy is

$$T = \tfrac{1}{2} m \left( \dot{r}^2 + r^2 \dot{\theta}^2 + r^2 \sin^2\theta\, \dot{\phi}^2 \right). \tag{5.96}$$


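From (5.61), the canonical momenta in $(r, \theta, \phi)$ coordinates are $p_r = m\dot{r}$, $p_\theta = mr^2\dot{\theta}$ and
$p_\phi = mr^2 \sin^2\theta\, \dot{\phi}$. Therefore, the Hamiltonian is

$$H = \frac{1}{2m} \left( p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\phi^2}{r^2 \sin^2\theta} \right) - \frac{k}{r}, \tag{5.97}$$

where $k = Ze^2/4\pi\epsilon_0$. We can therefore immediately convert this equation into a
Hamilton-Jacobi equation:

$$H = \frac{1}{2m} \left[ \left( \frac{\partial W}{\partial r} \right)^2 + \frac{1}{r^2} \left( \frac{\partial W}{\partial \theta} \right)^2 + \frac{1}{r^2 \sin^2\theta} \left( \frac{\partial W}{\partial \phi} \right)^2 \right] - \frac{k}{r} = E, \tag{5.98}$$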


where we have written Hamilton’s function as W, rather than S.° The total energy associated 
with the Hamiltonian H is E and is a constant of the motion. We now write the function W 
in separable form, 








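$$W = W_r(r) + W_\theta(\theta) + W_\phi(\phi). \tag{5.99}$$

First, we consider the partial differential term in $\phi$ in (5.98). This relation must be true for
all $\phi$ and so

$$\frac{\partial W}{\partial \phi} = \alpha_\phi = \text{constant}. \tag{5.100}$$

Therefore (5.98) becomes

$$\frac{1}{2m} \left[ \left( \frac{\partial W}{\partial r} \right)^2 + \frac{1}{r^2} \left\{ \left( \frac{\partial W}{\partial \theta} \right)^2 + \frac{\alpha_\phi^2}{\sin^2\theta} \right\} \right] - \frac{k}{r} = E. \tag{5.101}$$

Now the term in curly brackets in (5.101) involves only $\theta$ and so must also be a constant,

$$\left( \frac{\partial W}{\partial \theta} \right)^2 + \frac{\alpha_\phi^2}{\sin^2\theta} = \alpha_\theta^2. \tag{5.102}$$

Replacing the term in curly brackets by $\alpha_\theta^2$, the Hamilton-Jacobi equation becomes

$$\left( \frac{\partial W}{\partial r} \right)^2 + \frac{\alpha_\theta^2}{r^2} = 2m \left( E + \frac{k}{r} \right). \tag{5.103}$$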


The three equations (5.100), (5.102) and (5.103) are conservation equations for motion in
the three independent coordinates $(r, \theta, \phi)$. The first (5.100) is simply the conservation of
angular momentum about the fixed polar axis. There is also, however, angular momentum
about the $\theta$-axis. Conservation of the total angular momentum $p$ is given by (5.102) as may
be seen by comparing the expression for conservation of angular momentum for motion in
a plane

$$H = \frac{1}{2m} \left( p_r^2 + \frac{p^2}{r^2} \right) - \frac{k}{r} = E, \tag{5.104}$$

with (5.97). The third equation (5.103) corresponds to the conservation of total energy.
Now we convert these results to action-angle variables. These are defined by











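$$J_1 = J_\phi = \oint p_\phi\, d\phi = \oint \frac{\partial W}{\partial \phi}\, d\phi, \tag{5.105}$$

$$J_2 = J_\theta = \oint p_\theta\, d\theta = \oint \frac{\partial W}{\partial \theta}\, d\theta, \tag{5.106}$$

$$J_3 = J_r = \oint p_r\, dr = \oint \frac{\partial W}{\partial r}\, dr. \tag{5.107}$$

Using (5.100), (5.102) and (5.103), these integrals can be written

$$J_\phi = \oint \alpha_\phi\, d\phi, \tag{5.108}$$

$$J_\theta = \oint \sqrt{\alpha_\theta^2 - \frac{\alpha_\phi^2}{\sin^2\theta}}\; d\theta, \tag{5.109}$$

$$J_r = \oint \sqrt{2mE + \frac{2mk}{r} - \frac{\alpha_\theta^2}{r^2}}\; dr. \tag{5.110}$$

The first integral is trivial,

$$J_\phi = 2\pi\alpha_\phi. \tag{5.111}$$

The second integral is most simply evaluated by comparing the expressions for the kinetic
energy in spherical polar and plane polar coordinates, as was done in the comparison of
(5.104) with (5.97),

$$p_r \dot{r} + p_\theta \dot{\theta} + p_\phi \dot{\phi} = p_r \dot{r} + p \dot{\psi}, \tag{5.112}$$

where $p$ is the total angular momentum and $\psi$ is the azimuthal angle in the plane of the
orbit. Hence, replacing $p_\theta\, d\theta$ by $p\, d\psi - p_\phi\, d\phi$, the action integral for $J_\theta$ can be written

$$J_\theta = \oint p\, d\psi - \oint p_\phi\, d\phi. \tag{5.113}$$

Taking both integrals round $2\pi$, we find

$$J_\theta = 2\pi (p - p_\phi) = 2\pi (\alpha_\theta - \alpha_\phi). \tag{5.114}$$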


The third integral for the radial action therefore becomes

$$J_r = \oint \sqrt{2mE + \frac{2mk}{r} - \frac{(J_\theta + J_\phi)^2}{4\pi^2 r^2}}\; dr. \tag{5.115}$$

On performing this integral, we find a relation between the energy of the system and the
action integrals $J_\phi$, $J_\theta$ and $J_r$. This calculation is carried out by Goldstein by contour
integration with the final remarkable result

$$H = E = -\frac{2\pi^2 mk^2}{(J_r + J_\theta + J_\phi)^2}. \tag{5.116}$$

This calculation has been entirely classical and the values of the action variables are
continuous. Sommerfeld now applied his quantum conditions to the three action integrals
(5.105), (5.106) and (5.107) with the result

$$J_\phi = \oint p_\phi\, d\phi = n_\phi h, \tag{5.117}$$

$$J_\theta = \oint p_\theta\, d\theta = n_\theta h, \tag{5.118}$$

$$J_r = \oint p_r\, dr = n_r h, \tag{5.119}$$

where $n_\phi$, $n_\theta$ and $n_r$ are integers. Now only certain orientations of the orbits of the electrons
in space are allowed and the energy levels are again degenerate in the non-relativistic
limit. Inserting these values into (5.116) and setting $k = Ze^2/4\pi\epsilon_0$, the energies of the
three-dimensional elliptical orbits are

$$E = -\frac{Z^2 e^4 m}{8\epsilon_0^2 h^2 (n_r + n_\theta + n_\phi)^2}, \tag{5.120}$$

exactly the same form of result as (5.23).
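
To put numbers into (5.120), the following sketch evaluates the energy in electron-volts for hydrogen ($Z = 1$) and shows that different sets of quantum numbers with the same sum give the same energy, which is the degeneracy mentioned above. The function name and the particular quantum numbers chosen are illustrative assumptions. The constants are CODATA 2018 values and the electron mass is used rather than the reduced mass.

```python
import math

# CODATA 2018 values in SI units
ELEMENTARY_CHARGE = 1.602176634e-19     # C
ELECTRON_MASS = 9.1093837015e-31        # kg
VACUUM_PERMITTIVITY = 8.8541878128e-12  # F / m
PLANCK_CONSTANT = 6.62607015e-34        # J s


def orbit_energy_ev(n_r, n_theta, n_phi, Z=1):
    """Energy (5.120) in electron-volts for the given quantum numbers."""
    n = n_r + n_theta + n_phi
    if n <= 0:
        raise ValueError("the sum of the quantum numbers must be positive")
    energy_joule = -(Z ** 2 * ELEMENTARY_CHARGE ** 4 * ELECTRON_MASS) / (
        8.0 * VACUUM_PERMITTIVITY ** 2 * PLANCK_CONSTANT ** 2 * n ** 2
    )
    return energy_joule / ELEMENTARY_CHARGE


ground = orbit_energy_ev(0, 0, 1)
# Two different sets of quantum numbers with the same sum
first_excited_a = orbit_energy_ev(1, 0, 1)
first_excited_b = orbit_energy_ev(0, 1, 1)

print("sum = 1:", ground, "eV")
print("sum = 2:", first_excited_a, "eV and", first_excited_b, "eV")
```

The lowest energy comes out close to $-13.6$ eV, the familiar binding energy of the hydrogen atom, and the energies scale as the inverse square of the sum of the quantum numbers.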

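Let us now work out the frequencies $\nu_i$ associated with each independent coordinate.
From (5.91) and (5.92),

$$\nu_i = \dot{w}_i = \frac{\partial H}{\partial J_i}, \tag{5.121}$$

and so from (5.116), we find

$$\nu = \frac{\partial H}{\partial J_r} = \frac{\partial H}{\partial J_\theta} = \frac{\partial H}{\partial J_\phi} = \frac{4\pi^2 mk^2}{(J_r + J_\theta + J_\phi)^3}. \tag{5.122}$$

Thus, the frequencies are the same for all three independent coordinates, $r$, $\theta$ and $\phi$.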
These results are the basis of quantisation in the old quantum theory. The rules are as 
follows: 


• Quantisation in the $\phi$-coordinate corresponds to quantising the projection of the angular
momentum onto some chosen axis, which we can choose to be the $z$-direction of a
Cartesian coordinate system. The quantisation rule is $J_\phi/2\pi = n_\phi h/2\pi = mh/2\pi = m\hbar$,
where $m$ came to be known as the magnetic quantum number for reasons to be discussed
later.

• The combined quantisation in the $\theta$- and $\phi$-coordinates is given by
$J = J_\theta + J_\phi = (n_\theta + m)h$. This can be written $J = kh$ where $k$ is known as the azimuthal quantum
number and defines the quantised total angular momentum.

• The quantisation in the radial $r$ direction is given by (5.116)-(5.119) and corresponds
to $J_r + J_\theta + J_\phi = (n_r + n_\theta + m)h$. The quantum number $(n_r + n_\theta + m)$ defines the total energy of
the orbit given by (5.120) and is known as the principal quantum number.®
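
The equality of the three frequencies in (5.122) can be checked numerically by differentiating (5.116) with respect to each action variable in turn. The sketch below does this by central differences and, for comparison, evaluates the orbital frequency given by Kepler's third law, using the standard relation $E = -k/2a$ between the energy and the semi-major axis $a$ of a bound orbit, which is not derived in the text. The function names and numerical values are illustrative assumptions.

```python
import math


def hamiltonian(J_r, J_theta, J_phi, m, k):
    """Energy as a function of the action variables, following (5.116)."""
    J = J_r + J_theta + J_phi
    return -2.0 * math.pi ** 2 * m * k ** 2 / J ** 2


def frequencies(J_r, J_theta, J_phi, m, k, rel_step=1e-6):
    """Central-difference estimates of dH/dJ_r, dH/dJ_theta and dH/dJ_phi."""
    actions = [J_r, J_theta, J_phi]
    h = rel_step * sum(actions)
    result = []
    for i in range(3):
        up = list(actions)
        down = list(actions)
        up[i] += h
        down[i] -= h
        result.append((hamiltonian(*up, m, k) - hamiltonian(*down, m, k)) / (2.0 * h))
    return result


# Illustrative values in arbitrary units (assumptions, not taken from the text)
m, k = 1.0, 1.0
J_r, J_theta, J_phi = 0.5, 1.0, 1.5

nu_numeric = frequencies(J_r, J_theta, J_phi, m, k)
nu_formula = 4.0 * math.pi ** 2 * m * k ** 2 / (J_r + J_theta + J_phi) ** 3

# Kepler's third law for comparison, using the standard relation E = -k / (2a)
E = hamiltonian(J_r, J_theta, J_phi, m, k)
a = k / (2.0 * abs(E))
nu_kepler = math.sqrt(k / (m * a ** 3)) / (2.0 * math.pi)

print(nu_numeric, nu_formula, nu_kepler)
```

All three numerical derivatives agree with (5.122), and the common value is the Kepler orbital frequency, as expected for a closed elliptical orbit.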


We have concentrated upon the non-relativistic case of hydrogen-like atoms in which
the electron moves in a strictly inverse-square law electrostatic field of force. For other
atoms, the field experienced by the electron is generally not an inverse-square law and
then the frequencies of oscillation of the independent coordinates need not be the same.
In particular, they need not be related by the ratios of integers. In these cases, the overall
motion of the electron is not simply periodic, although each of the coordinates undergoes
simple periodic motion. When the associated frequencies are not all rational fractions of
each other, the motion is referred to as conditionally periodic. In the two-dimensional case
in Cartesian coordinates, conditionally periodic motion can be represented by Lissajous
figures as illustrated in Fig. 5.4. The motion of the electron is represented by oscillations
at slightly different frequencies in the x- and y-directions. If the ratio of frequencies is
a ratio of integers, the loci will eventually repeat, but in the more general case in which
the ratio can be irrational, the loci never join up and will fill the complete x-y
plane.

Fig. 5.4 Illustrating the motion of an electron in the x-y plane when the frequencies of
oscillation are different in the x- and y-directions (Sommerfeld, 1919).
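
The distinction between periodic and conditionally periodic motion can be illustrated with a few lines of Python. Each time the x-oscillation completes a whole number of cycles, the sketch below records how far the y-oscillation is from completing a whole number of cycles. A value of zero means that the figure closes. The function name and the two frequency ratios are illustrative assumptions, and a floating-point $\sqrt{2}$ only approximates an irrational ratio.

```python
import math


def y_mismatch_at_x_returns(nu_x, nu_y, n_returns):
    """For each of the first n_returns completed x-cycles, return the distance
    of the y-oscillation from a whole number of cycles (0 means the curve closes)."""
    mismatches = []
    for j in range(1, n_returns + 1):
        t = j / nu_x                 # time at which x has completed j cycles
        cycles_y = nu_y * t
        mismatches.append(abs(cycles_y - round(cycles_y)))
    return mismatches


# Illustrative frequencies (assumptions): a 2:3 ratio and a ratio of sqrt(2)
closed = y_mismatch_at_x_returns(2.0, 3.0, 12)
open_curve = y_mismatch_at_x_returns(2.0, 2.0 * math.sqrt(2.0), 12)

print("rational ratio, smallest mismatch:  ", min(closed))
print("irrational ratio, smallest mismatch:", min(open_curve))
```

For the 2:3 ratio the mismatch vanishes after two x-cycles, whereas for the ratio $\sqrt{2}$ it becomes small but does not vanish within the returns examined, which is the behaviour described above.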

These were very considerable triumphs and many of these features will reappear in a 
very different guise in the full quantum theory of Heisenberg and Schrödinger. In 1916, 
the tools of action-angle variables were clearly the way ahead and these had already
been developed to a high degree of sophistication by the dynamical astronomers. These 
techniques flourished and became the preferred methods of the theoretical physicist. While 
the model of the hydrogen atom was a triumph, the classical Bohr-Sommerfeld model ran 
into almost insuperable difficulties with multi-electron systems. Fortunately, the techniques 
for dealing with small perturbations to planetary orbits, which would result in slow changes 
of conserved quantities as the system evolved, had been pioneered by the astronomers 
and so these tools were available. They were to be used to great effect in understanding 
the influence of electric and magnetic fields upon the Bohr-Sommerfeld model of the 
atom. 


5.6 Ehrenfest and the adiabatic principle 


Ehrenfest was no admirer of quantum theory. As he remarked in a letter to Lorentz of
25 August 1913,


‘Bohr’s work on the quantum theory of the Balmer formula (in the Philosophical Magazine)
has driven me to despair. . . . If this is the way to reach the goal, I must give up doing
physics.’


Again in May 1916, following Sommerfeld’s extension of the Bohr model, he wrote to 
Sommerfeld 


‘Even though I consider it horrible that this success will help the preliminary, but still 
incomplete monstrous Bohr model to new triumphs, I nevertheless heartily wish physics 
at Munich further success along this path.’ (Klein, 1970) 


Despite his reservations, Ehrenfest became deeply involved in issues in quantum physics 
and, in particular, introduced the concept of adiabatic invariance into the formalism of the 
old quantum theory. 

The seeds of this concept were already present in the discussions at the 1911 Solvay 
Conference. Lorentz raised the issue of whether or not a quantised pendulum whose string 
is being shortened remained in a quantised state. Without hesitation, Einstein replied, 


‘If the length of the pendulum is changed infinitely slowly, its energy remains hv if it was 
originally hv.’ (Einstein, 1912) 


This is an example of adiabatic invariance. Sommerfeld carried out this calculation in his
book Atomic Structure and Spectral Lines (1919). Let us repeat it here and clarify exactly
what Einstein meant. The pendulum has length $l$ and the bob mass $m$. The instantaneous
angle from the vertical direction is $\phi$, the maximum amplitude $\phi_0$ being assumed to be
small. The angular frequency of oscillation of the pendulum is then $\omega_0 = 2\pi\nu_0 = \sqrt{g/l}$.
The tension $S$ in the string is the sum of the gravitational and centripetal forces acting on
the bob and so

$$S = mg \cos\phi + ml\dot{\phi}^2. \tag{5.123}$$


Now the length of the string is shortened very slowly so that there are many swings of the
pendulum as its length decreases by $dl$. The average work done in shortening the string by
$dl$ is

$$dW = \overline{S}\,|dl| = -mg\, \overline{\cos\phi}\; dl - ml\, \overline{\dot{\phi}^2}\; dl, \tag{5.124}$$


where the bars indicate averages taken over a period of the pendulum swing. The signs are
negative since the increment of length $dl$ is negative. Taking the motion of the pendulum
to be

$$\phi = \phi_0 \sin(\omega_0 t + \gamma), \tag{5.125}$$

we find

$$\overline{\cos\phi} = 1 - \tfrac{1}{2}\overline{\phi^2} = 1 - \tfrac{1}{4}\phi_0^2, \qquad \overline{\dot{\phi}^2} = \tfrac{1}{2}\omega_0^2\phi_0^2 = \frac{g}{2l}\,\phi_0^2. \tag{5.126}$$

Therefore,

$$dW = -\left[ mg \left( 1 - \frac{\phi_0^2}{4} \right) + \frac{mg\phi_0^2}{2} \right] dl = -mg \left( 1 + \frac{\phi_0^2}{4} \right) dl. \tag{5.127}$$

The first term in $dW$, $-mg\,dl$, is the work needed to raise the mean position of the bob by
$dl$ while the second represents the increase in the average kinetic energy of motion of the
pendulum. The total energy is, as usual, twice the mean kinetic energy, $E = 2\overline{E_{\rm kin}}$, where

$$\overline{E_{\rm kin}} = \frac{ml^2}{2}\, \overline{\dot{\phi}^2} = \frac{mgl}{4}\,\phi_0^2. \tag{5.128}$$

Differentiating this expression, the change in total energy is

$$dE = \frac{mg}{2}\,\phi_0^2\, dl + mgl\phi_0\, d\phi_0. \tag{5.129}$$

Equating (5.129) to the work done in increasing the motion of the pendulum, the term
$-mg\phi_0^2\, dl/4$ from (5.127), we find

$$\tfrac{3}{4}\phi_0\, dl = -l\, d\phi_0, \tag{5.130}$$


and so, integrating,

$$\tfrac{3}{4} \log l = -\log\phi_0 + \text{constant}, \qquad l^{3/4}\phi_0 = \text{constant}. \tag{5.131}$$

Thus, as the string of the pendulum shortens, its angular amplitude increases, although
the physical amplitude of the swing decreases, $\Delta x = \phi_0 l \propto l^{1/4}$. Since $g$ is a constant, the
frequency of oscillation increases as $\omega \propto l^{-1/2}$. We can now work out the relation between
the total energy of oscillation of the pendulum and its frequency from (5.128). We obtain
the key result

$$\frac{E}{\nu} = \text{constant}. \tag{5.132}$$


This is what Einstein meant in his response to Lorentz. 
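
The invariance of $E/\nu$ can also be seen in a direct numerical experiment. The sketch below integrates the small-angle equation of motion of a pendulum whose string is shortened at a slow, constant rate, $d(l^2\dot{\phi})/dt = -gl\phi$, using a fourth-order Runge-Kutta scheme, and compares $E/\omega$ at the start and at the end, where $E$ is the energy of oscillation per unit bob mass. The equation of motion follows from the Lagrangian for a pendulum of prescribed length and is not written out in the text. The function name, the units with $g = 1$, the rate of shortening and the step size are illustrative assumptions.

```python
import math


def simulate(l_start=1.0, l_end=0.5, t_total=400.0, dt=0.005, g=1.0, phi0=0.1):
    """Integrate the slowly shortened small-angle pendulum with RK4.

    Returns ((E, E/omega) at the start, (E, E/omega) at the end), per unit bob mass.
    """
    rate = (l_end - l_start) / t_total   # dl/dt, constant and small

    def length(t):
        return l_start + rate * t

    def deriv(t, phi, phidot):
        l = length(t)
        # d/dt(l^2 phidot) = -g l phi, expanded for phidot' at small angles
        return phidot, -(2.0 * rate / l) * phidot - (g / l) * phi

    def energy_and_invariant(t, phi, phidot):
        l = length(t)
        energy = 0.5 * l * l * phidot ** 2 + 0.5 * g * l * phi ** 2
        return energy, energy / math.sqrt(g / l)

    t, phi, phidot = 0.0, phi0, 0.0
    start = energy_and_invariant(t, phi, phidot)
    steps = int(round(t_total / dt))
    for _ in range(steps):
        k1 = deriv(t, phi, phidot)
        k2 = deriv(t + 0.5 * dt, phi + 0.5 * dt * k1[0], phidot + 0.5 * dt * k1[1])
        k3 = deriv(t + 0.5 * dt, phi + 0.5 * dt * k2[0], phidot + 0.5 * dt * k2[1])
        k4 = deriv(t + dt, phi + dt * k3[0], phidot + dt * k3[1])
        phi += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        phidot += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        t += dt
    return start, energy_and_invariant(t, phi, phidot)


(start_energy, start_invariant), (end_energy, end_invariant) = simulate()
print("E changes by a factor     ", end_energy / start_energy)
print("E/omega changes by a factor", end_invariant / start_invariant)
```

With these settings the string is halved over many swings. The energy of oscillation should grow by a factor close to $\sqrt{2}$, in step with the frequency, while $E/\omega$ should stay constant to within a small fraction of a per cent. Shortening the string more quickly degrades the invariance.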

The quantity E/v is an example of an adiabatic invariant. Adiabatic invariance had been 
in the literature in various guises since the understanding of the first and second laws of 
thermodynamics in the mid-1850s. In classical thermodynamics, adiabatic processes play 
a central role in defining the underlying structure of the theory. To emphasise this point, 
let me repeat the description of what is involved in reversible adiabatic processes from my 
treatment in Theoretical Concepts in Physics (Longair, 2003). 


‘A reversible process is one which is carried out infinitely slowly so that, in passing from 
the state A to state B, the system passes through an infinite number of equilibrium states. 
Since the process takes place infinitely slowly, there is no friction or turbulence and no 
sound waves are generated. At no stage are there unbalanced forces. At each stage, we 
make only an infinitesimal change to the system. The implication is that, by reversing the 
process precisely, we can get back to the point from which we started and nothing will 
have changed in either the system or its surroundings. ... 

Let us emphasise this point by considering in detail how we could carry out a reversible 
isothermal expansion. Suppose we have a large heat reservoir at temperature T and a 
cylinder with gas in thermal contact with it also at temperature T. No heat flows if the 
two are at the same temperature. But if we make an infinitesimally small movement of the 
piston outwards, the gas in the cylinder cools infinitesimally and so an infinitesimal amount 
of heat flows into the gas by virtue of the temperature difference. This small amount of
energy brings the gas back to T. The system is reversible because, if we compress the gas
at T slightly, it heats up and heat flows from the gas into the reservoir. Thus, provided we 
consider only infinitesimal changes, the heat flow process occurs reversibly. 

Clearly, this is not possible if the reservoir and the piston are at different temperatures. 
In this case, we cannot reverse the direction of heat flow by making an infinitesimal change 
in the temperature of the cooler object. This makes the important point that, in reversible 
processes, the system must be able to evolve from one state to another by passing through 
an infinite set of equilibrium states which we join together by infinitesimal increments of 
work and energy flow. 

To reinforce this point, let us repeat the argument for an adiabatic expansion. The 
cylinder is completely thermally isolated from the rest of the Universe. Again, we perform 
each step infinitesimally slowly. There is no flow of heat in or out of the system and there 
is no friction. Therefore, since each infinitesimal step is reversible, we can perform the 
whole expansion by adding lots of them together.’ 


In the case of a reversible adiabatic expansion, the pressure and volume are related
by the expression $pV^\gamma = \text{constant}$, where $\gamma$ is the ratio of the specific heat capacities
at constant pressure and constant volume respectively, $\gamma = C_p/C_V$. In terms of pressure
and temperature, $pT^{-\gamma/(\gamma-1)} = \text{constant}$. At all stages in the expansion, the system is in
thermal equilibrium at temperature $T$ and so the total energy of the system is $E = V\epsilon$,
where $\epsilon$ is the energy density of the gas. The relation between pressure and internal energy
density is $p = (\gamma - 1)\epsilon$ and so, since $pV = RT$, the relation between the energy $E$ and
the temperature $T$ for one mole of gas is

$$E = \frac{RT}{\gamma - 1}, \qquad \text{that is,} \qquad \frac{E}{T} = \text{constant}. \tag{5.133}$$

This is a further example of an adiabatic invariant. But, we have seen this before. The
derivation of Wien’s displacement law described in Sect. 1.7.2 involved the use of two
adiabatic invariants. From (1.31) and (1.32), we find that the total energy $E = V\epsilon$, the
frequency $\nu$ and the temperature $T$ are related by the invariants

$$E/\nu = \text{constant}, \qquad \nu/T = \text{constant} \qquad \text{and so} \qquad E/T = \text{constant}. \tag{5.134}$$


Ehrenfest’s interest in adiabatic invariance resulted from his perplexity about Planck’s 
and Einstein’s derivation of the form of the Planck spectrum in which classical and quantum 
concepts were mixed together. Ehrenfest’s concern was that the Stefan—Boltzmann law and 
Wien’s displacement law are derived by purely classical arguments as we demonstrated in 
Sects. 1.7.1 and 1.7.2 and yet the highly non-classical procedure of quantisation had to be 
included to derive the correct form of the black-body spectrum. By 1913, Ehrenfest had 
resolved the conundrum when he realised that the quantities which were quantised were 
in fact adiabatic invariants (Ehrenfest, 1913). As shown by (5.134), $E/\nu$ is an invariant in
the adiabatic expansion of a gas of radiation. If the rule of quantisation is applied to this
adiabatic invariant, we immediately find

$$E/\nu = nh \qquad \text{where} \qquad n = 0, 1, 2, \ldots \tag{5.135}$$

Thus, the rule of quantisation is preserved during the adiabatic expansion from $T$ to another
temperature. Besides resolving Ehrenfest’s concern, this provided a new approach to the issue of which
quantities were to be subject to the rules of quantisation. At the time, little attention was
paid to Ehrenfest’s ideas, but Bohr and Einstein recognised their importance. Einstein called
this approach to quantisation Ehrenfest’s adiabatic hypothesis, namely


‘If a system be affected in a reversible adiabatic way, allowed motions are transformed 
into allowed motions.’ 


In his paper of 1916, Ehrenfest stated, 


‘... the hypothesis gives restrictions to the arbitrariness which exists otherwise in the
introduction of the quanta.’ (Ehrenfest, 1916)


This paper of 1916 presented a much more general and abstract definition of the concept 
of adiabatic invariance. In Ehrenfest’s words, 


‘Let the coordinates of the system be denoted by $q_1, \ldots, q_n$. The potential energy $\Phi$ may
contain besides the coordinates $q$ certain ‘parameters’ $a_1, a_2, \ldots$, the values of which
can be altered infinitely slowly. The kinetic energy $T$ may be a homogeneous quadratic
function of the velocities $\dot{q}_1, \ldots, \dot{q}_n$, the coefficients of which are functions of $q$ and may
be of $a_1, a_2, \ldots$. By changing the parameters from the values $a_1, a_2, \ldots$ to $a'_1, a'_2, \ldots$ in
an infinitely slow way, a given motion $\beta(a)$ is transformed into another motion $\beta(a')$.
This special type of influencing upon the system may be called “a reversible adiabatic
affection”, the motions $\beta(a)$ and $\beta(a')$, “adiabatically related to each other”.’


This last statement means that the process must be a reversible adiabatic change. 

This hypothesis immediately accounted for one of the remarkable results of Sommerfeld’s
development of the Bohr model of the atom, namely that the quantised circular
and elliptical orbits of the same principal quantum number had the same energies and
frequencies. It could be shown that the model with circular orbits is adiabatically related, in
Ehrenfest’s sense, to elliptical orbits of the same frequency. Consequently, since $E/\nu$ is an
adiabatic invariant and equal to $nh$, it follows that the orbits must have the same energies.

The final step in the argument concerned the appropriate sets of coordinates which 
should be employed in applying the quantum conditions. The answer came from the studies 
of Schwarzschild and Epstein whose papers were concerned with the Stark effect, but 
which contained within them the correct answer (Schwarzschild, 1916; Epstein, 1916a). 
The answer was that, for systems of dynamical equations which were separable, the quantum
conditions should be applied in the action-angle coordinates discussed in Sects. 5.4.4 and
5.5. Specifically, they showed that the action variables $J$ are the quantities to be identified
with adiabatic invariants and that these lead directly to Sommerfeld’s quantum conditions
(5.117)-(5.119). It is now clear why Planck referred to $h$ as the quantum of action.

These results led to a great deal of mathematical study of the conditions under which the 
dynamical equations would be separable and also to the issue of the degeneracy of the orbits 
in the non-relativistic Bohr-Sommerfeld model. Burgers, a student and then collaborator 
of Ehrenfest’s at Leiden, went on to study the conditions under which the quantisation
rules should be applied to systems in which the Hamiltonian is time-variable, that is, to
systems described by the time-dependent version of the Hamilton-Jacobi equation (5.87), which is

$$H\left(q_1, q_2, \ldots, q_n, \frac{\partial S}{\partial q_1}, \frac{\partial S}{\partial q_2}, \ldots, \frac{\partial S}{\partial q_n}, t\right) + \frac{\partial S}{\partial t} = 0. \tag{5.136}$$


The conditions are not dissimilar from those for the time-independent equation and in 
addition he showed that the action variables J are adiabatic invariants (Burgers, 1916). 
These analyses provided the framework for the more detailed study of quantum phenomena, 
in particular the Zeeman and Stark effects which are discussed in Chap. 7. 


5.7 The developing infrastructure of quantum theory 


After only a few years, Bohr’s dramatic insight into the nature of quantum phenomena in 
atoms and molecules had been placed on a much more secure theoretical foundation. The 
apparent arbitrariness of the quantum conditions had been replaced by the understanding that 
the quantities which had to be quantised were the adiabatic invariants of classical physics. 
Furthermore, the appropriate mathematical tools for treating quantised systems had been
established: the action-angle variables of the transformation theory of mechanics, which
were derived from Hamiltonian mechanics and dynamics. This mathematical technique,
which had been developed for applications in celestial mechanics, suddenly became the 
tool of choice of theorists for the study of quantum problems. This brought with it the 
full apparatus of higher mechanics which was to become the bread and butter of the 
theoretical physicist. These developments had a long-range impact upon quantum theory 
since many of the mathematical techniques were to find application in somewhat different 
guises when Heisenberg’s and Schrödinger’s versions of the theory of quantum processes 
were enunciated. 

The exploitation of this newly won understanding was the immediate goal of the theorists 
who still had a long way to go before the theory could account for the details of quantum 
phenomena.


The correspondence principle and the first selection rules


6.1 The problem of transitions between stationary states 





The understanding of adiabatic invariants as the quantities which are subject to the
Bohr-Sommerfeld quantisation conditions and their mathematical description through
Hamilton-Jacobi theory and action-angle variables provided the tools for determining the energies
of stationary states of quantum systems and hence the frequencies of the radiation emitted
in transitions between them. The theory had, however, nothing to say about the physical
processes by which the transitions take place nor about the intensities and polarisations
of the resulting radiation. Bohr proposed to address these issues through his
correspondence principle. In simple terms, the principle states that in the limit of large
quantum numbers, when $\Delta n \ll n$, the radiation processes between stationary states should
approach the classical results derived from Maxwell’s theory of electromagnetic radiation
and so provide information about the intensity and polarisation properties of the
radiation.

This correspondence principle had already been foreshadowed in Planck’s theory of 
the spectrum of black-body radiation of 1900. The expression for the equilibrium energy 
density of radiation in an enclosure at temperature T, derived in Sect. 2.6, is 


$$u(\nu) = \frac{8\pi h\nu^3}{c^3}\, \frac{1}{e^{h\nu/kT} - 1}. \tag{6.1}$$

In Planck’s derivation of this result, it was assumed that the energy levels of the oscillators
are quantised for the emission of radiation throughout the electromagnetic spectrum and
yet, at frequencies $h\nu \ll kT$, (6.1) reduces to the classical formula, the Rayleigh-Jeans
law,

$$u(\nu) = \frac{8\pi h\nu^3}{c^3}\, \frac{1}{e^{h\nu/kT} - 1} \to \frac{8\pi\nu^2}{c^3}\, kT. \tag{6.2}$$
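
The approach to the classical limit in (6.2) is easily quantified. Dividing the Planck form (6.1) by the Rayleigh-Jeans form gives $x/(e^x - 1)$ with $x = h\nu/kT$, since the common factor $8\pi\nu^2/c^3$ cancels. The short sketch below tabulates this ratio. The function name and the sample values of $x$ are illustrative assumptions.

```python
import math


def planck_to_rayleigh_jeans_ratio(x):
    """Ratio of the Planck form (6.1) to the Rayleigh-Jeans form in (6.2),
    as a function of x = h * nu / (k * T)."""
    # expm1(x) evaluates exp(x) - 1 without loss of precision at small x
    return x / math.expm1(x)


for x in (1e-3, 1e-1, 1.0, 10.0):
    print(x, planck_to_rayleigh_jeans_ratio(x))
```

For $x \ll 1$ the ratio tends to unity, approximately as $1 - x/2$, while for $x \gg 1$ it falls off exponentially, which is where the quantum nature of the oscillators becomes essential.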





The Rayleigh-Jeans law (6.2) was derived by Rayleigh using purely classical arguments, as shown in
Sect. 2.3.4. This limiting behaviour was to be elaborated by Bohr into his correspondence principle, but
before that Einstein made a crucial advance in tackling the issue of quantum transitions
from a probabilistic perspective.


6.2 On the quantum theory of radiation (Einstein 1916)


During the years 1911-1916, Einstein was fully preoccupied with the formulation of general 
relativity, one of the most remarkable intellectual achievements in the history of physics. 
Following the 1911 Solvay Conference, he wrote relatively little about quantum physics 
until he returned to the origin of the spectrum of black-body radiation in 1916. As recounted 
in Chaps. 4 and 5, by that time, the general opinion had shifted in favour of the view that 
quanta and quantum theory had to be taken seriously in understanding the physics of atoms. 

It is important to recall that Einstein’s proposal of 1905 that light is composed of discrete 
quanta was significantly more revolutionary than Planck’s introduction of quantisation 
which only applied to the sources of the radiation. The quotations from the writings of 
Planck in 1907 and Lorentz in 1909 in Sect. 3.6 indicate that they preferred to think of the 
emission, absorption and propagation of radiation strictly in terms of Maxwell’s classical 
electromagnetic theory. In contrast, Einstein never deviated from his belief in the reality 
of light quanta. His paper on fluctuations in the intensity of black-body radiation described 
in Sect. 3.6 was an example of this continuing pursuit. The paper of 1916 was a further 
contribution to his crusade to convince his colleagues of the reality of light quanta. The paper 
is best remembered today for its introduction of what are now known as Einstein’s A and B
coefficients but it also introduces concepts which were to be important in the development 
of the theory of quanta, particularly the introduction of transition probabilities. 

Following Bohr’s pioneering efforts of 1913, the emphasis of quantum physics shifted 
to the understanding of the details of atomic spectra as discussed in Chaps. 4 and 5.
Let us look at Einstein’s paper of 1916 in a little detail (Einstein, 1916). He begins by 
noting the formal similarity between the Maxwell—Boltzmann distribution for the velocity 
distribution of the molecules in a gas and Planck’s formula for the black-body spectrum. 
Einstein shows how these distributions can be reconciled through his new derivation of the 
Planck spectrum, which gives insight into what he refers to as the ‘still unclear processes 
of emission and absorption of radiation by matter.’ The paper begins with a description of 
a quantum system consisting of a large number of molecules which can occupy a discrete 
set of states $Z_1, Z_2, Z_3, \ldots$ with corresponding energies $\varepsilon_1, \varepsilon_2, \varepsilon_3, \ldots$. According to
classical statistical mechanics, the relative probabilities $W_n$ of these states being occupied
in thermodynamic equilibrium at temperature $T$ are given by Boltzmann’s relation

$$W_n = g_n \exp\left(-\frac{\varepsilon_n}{kT}\right), \qquad (6.3)$$

where the $g_n$ are the statistical weights, or degeneracies, of the states $Z_n$, meaning the
number of states with exactly the same energy $\varepsilon_n$. As Einstein remarks in his paper


‘[(6.3)] expresses the farthest-reaching generalisation of Maxwell’s velocity distribution 
law.’ 


Consider two quantum states of the gas molecules, $Z_m$ and $Z_n$, with energies $\varepsilon_m$ and
$\varepsilon_n$ respectively, such that $\varepsilon_m > \varepsilon_n$. Bohr’s model gives the frequency of the radiation
emitted in such a transition as $h\nu = \varepsilon_m - \varepsilon_n$, but the mechanism by which this takes place is
not specified. According to Einstein’s picture, the transition is associated with the emission
of a quantum of radiation of energy $h\nu$, which for convenience we will refer to as a photon.¹
Similarly, when a photon of energy $h\nu$ is absorbed, the molecule changes from the state
$Z_n$ to $Z_m$. Einstein’s aim was to develop a purely quantum approach to the emission and
absorption of radiation.

The quantum description of these processes follows by analogy with the classical processes
of the emission and absorption of radiation; Jammer (1989) remarks that this is an
early manifestation of what was to be embodied in Bohr’s correspondence principle.


• Induced emission and absorption. By analogy with the classical case, if an oscillator is
excited by waves of the same frequency $\nu$ as the oscillator, it either gains or loses energy,
depending upon the phase of the wave relative to that of the oscillator, that is, the work
done on the oscillator can be either positive or negative. The magnitude of the positive
or negative work done is proportional to the energy density $u$ of the incident waves with
frequency $\nu$. The quantum mechanical equivalents of these processes are those of induced
absorption, in which the molecule absorbs the photon and is consequently excited from
the state $Z_n$ to $Z_m$, and induced emission, in which the molecule emits a photon under the
influence of the incident radiation field. The probabilities of these processes are written

$$\text{Induced absorption:}\quad dW = B_n^m\,u\,dt,$$
$$\text{Induced emission:}\quad dW = B_m^n\,u\,dt.$$

The lower indices refer to the initial state and the upper indices to the final state. $B_n^m$ and
$B_m^n$ are constants for a particular pair of energy states, and are referred to as coefficients
associated with ‘changes of state by induced absorption and emission’.

• Spontaneous emission. Einstein noted that an electric dipole oscillator emits radiation
‘spontaneously’ in the absence of excitation by an external field. The corresponding
process at the quantum level is called spontaneous emission, the probability of the
emission of a photon taking place in the time interval $dt$ without external causes being

$$dW = A_m^n\,dt. \qquad (6.4)$$

Here Einstein used the analogy with the radioactive decay law, $N = N_0 \exp(-\alpha t)$, in
which the nucleus decays spontaneously. This can be interpreted in terms of a probability
of a decay occurring in the time interval $dt$ by the rule $p(t)\,dt = \alpha\,dt$. He remarked,


‘One can hardly think of it in any other way except as a radioactive reaction.’ 


We now seek the spectrum of the energy density of radiation $u(\nu)$ in thermal equilibrium.
The relative numbers of molecules with energies $\varepsilon_m$ and $\varepsilon_n$ in thermal equilibrium are given
by the Boltzmann relation (6.3) and so, in order to leave the equilibrium distribution
unchanged under the processes of spontaneous and induced emission and induced absorption
of radiation, the probabilities must balance, that is,

$$\underbrace{g_n e^{-\varepsilon_n/kT} B_n^m u}_{\text{absorption}} = \underbrace{g_m e^{-\varepsilon_m/kT}\left(B_m^n u + A_m^n\right)}_{\text{emission}}. \qquad (6.5)$$

In the limit $T \to \infty$, the radiation energy density $u \to \infty$, and the induced processes
dominate the equilibrium. Allowing $T \to \infty$, $A_m^n \ll B_m^n u$ and so (6.5) becomes

$$g_n B_n^m = g_m B_m^n. \qquad (6.6)$$

Reorganising (6.5), the equilibrium radiation spectrum $u$ can be written

$$u = \frac{A_m^n / B_m^n}{\exp\left(\dfrac{\varepsilon_m - \varepsilon_n}{kT}\right) - 1}. \qquad (6.7)$$

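But this is Planck’s radiation law. Einstein had already demonstrated in 1905 that, in the
Wien limit $h\nu \gg kT$, light can be considered to consist of a gas of photons. In that limit,

$$u = \frac{A_m^n}{B_m^n} \exp\left(-\frac{\varepsilon_m - \varepsilon_n}{kT}\right) \propto \nu^3 \exp\left(-\frac{h\nu}{kT}\right). \qquad (6.8)$$

Therefore, we find the following relations

$$\frac{A_m^n}{B_m^n} \propto \nu^3, \qquad \varepsilon_m - \varepsilon_n = h\nu. \qquad (6.9)$$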


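The value of the constant in (6.8) can be found from the Rayleigh–Jeans limit of the
black-body spectrum, $(\varepsilon_m - \varepsilon_n)/kT \ll 1$. From (6.2), it follows that

$$u(\nu) = \frac{8\pi\nu^2}{c^3}\,kT = \frac{A_m^n}{B_m^n}\,\frac{kT}{h\nu} \quad\text{and so}\quad \frac{A_m^n}{B_m^n} = \frac{8\pi h\nu^3}{c^3}. \qquad (6.10)$$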


The $A_m^n$ and $B_m^n$ coefficients are associated with atomic processes at the microscopic level.
Once $A_m^n$ or $B_m^n$ or $B_n^m$ is known, the other coefficients can be found from (6.6) and (6.10)
immediately. Einstein wrote exuberantly to his friend Michele Besso on 11 August 1916,


‘A splendid flash came to me concerning the absorption and emission of radiation... A 
surprisingly simple derivation of Planck’s formula, I would say the derivation. Everything 
completely quantum.’ 
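
The chain of reasoning from (6.5) to (6.10) can be checked numerically. The sketch below
solves the balance equation (6.5) for $u$ using the relations (6.6) and (6.10), with
$\varepsilon_m - \varepsilon_n = h\nu$, and compares the result with Planck’s formula (6.1). The statistical
weights and the value chosen for $B_m^n$ are arbitrary illustrative assumptions; the point of
the check is that they cancel from the final spectrum.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K = 1.380649e-23     # Boltzmann constant, J / K
C = 2.99792458e8     # speed of light, m / s


def planck_u(nu, temperature):
    """Planck's energy density, equation (6.1)."""
    return 8 * math.pi * H * nu**3 / C**3 / math.expm1(H * nu / (K * temperature))


def equilibrium_u(nu, temperature, g_n=1.0, g_m=3.0, b_emission=1.0):
    """Solve the balance equation (6.5) for u.

    g_n, g_m and b_emission (standing for B_m^n) are illustrative values only.
    """
    a_spontaneous = 8 * math.pi * H * nu**3 / C**3 * b_emission  # (6.10)
    b_absorption = g_m * b_emission / g_n                        # (6.6)
    # Boltzmann weights (6.3), measured relative to the lower state Z_n.
    w_n = g_n
    w_m = g_m * math.exp(-H * nu / (K * temperature))
    # (6.5): w_n * B_n^m * u = w_m * (B_m^n * u + A_m^n), rearranged for u.
    return w_m * a_spontaneous / (w_n * b_absorption - w_m * b_emission)


if __name__ == "__main__":
    nu, temperature = 5e14, 5000.0
    print(equilibrium_u(nu, temperature), planck_u(nu, temperature))
```

Changing `g_n`, `g_m` or `b_emission` leaves the computed energy density unchanged, in line
with the statement that once one coefficient is known the others follow from (6.6) and (6.10).
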


This important analysis occupies only the first three sections of Einstein’s paper. The 
remainder of the paper concerns the transfer of momentum as well as energy between 
matter and radiation. We will not go through the details of the argument, but summarise its 
physical content. According to standard kinetic theory, when molecules collide in a gas in 
thermal equilibrium, there are fluctuations in the momentum transfer between molecules
which amount to

$$\frac{\overline{\Delta^2}}{\tau} = 2RkT, \qquad (6.11)$$

where $\Delta$ is the momentum transfer to a molecule during a short time interval $\tau$ and $R$ is a
constant related to the ‘frictional’ force acting upon the moving molecules. Note the striking
similarity with Einstein’s famous formula (3.1) for the diffusion of microscopic particles 
undergoing Brownian motion. Now suppose the density of atoms is reduced to an extremely 
low value so that the momentum transfer is dominated by collisions between photons and 
the few remaining atoms, which can be considered collisionless. According to Einstein’s 
quantum hypothesis, the photons have energy $h\nu$ and momentum $h\nu/c$. Therefore, the
particles should be brought into thermal equilibrium at temperature T entirely through 
collisions between photons and particles. This was the reason that Einstein needed his 
equations for spontaneous emission and induced absorption and emission of radiation 
since these determine the transfer of energy and, more important in the present context, 
momentum between the particles and the radiation. In the rest of the paper of 1916, Einstein 
showed that, assuming the momentum transfers occur randomly in directional collisions 
between photons and electrons, the variance of the fluctuations in the momentum transfer 
is exactly the same expression as (6.11). This could not happen if the energy re-radiated 
by the particles was isotropic because then there would be no random component in the 
momentum transfer process. The key result was that, when a molecule emits or absorbs a
quantum $h\nu$, there must be a positive or negative change in the momentum of the molecule
of magnitude $|h\nu/c|$, even in the case of spontaneous emission. In Einstein’s words,


‘Outgoing radiation in the form of spherical waves does not exist. During the elementary
process of radiative loss, the molecule suffers a recoil of magnitude $h\nu/c$ in a direction
which is determined only by “chance”, according to the present state of the theory.’


His view of the importance of the calculation is summarised in the version of the paper 
published in 1917. 


‘The most important thing seems to me to be the momenta transferred to the molecule 
[atom] in the processes of absorption and emission. If any of our assumptions concerning 
the transferred momenta were changed [(6.11)] would be violated. It hardly seems possible 
to reach agreement with this relation, which is demanded by the [kinetic] theory of heat, 
in any other way than on the basis of our assumption.’ 


Direct experimental evidence for the correctness of Einstein’s conclusion was only obtained
in 1923 from Compton’s X-ray scattering experiments (Compton, 1923). These showed
that photons undergo collisions in which they behave like particles, the Compton effect
or Compton scattering. It is a standard result of special relativity that the increase in
wavelength of a photon in collision with a stationary electron is

$$\lambda' - \lambda = \frac{hc}{m_e c^2}\,(1 - \cos\theta), \qquad (6.12)$$

where $\theta$ is the angle through which the photon is scattered.² Implicit in this calculation is
the conservation of relativistic three-momentum in which the momentum of the photon is
$h\nu/c$.
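
To give a sense of scale for (6.12), the short Python sketch below evaluates the wavelength
shift for a few scattering angles. The function name and the SI values of the constants are
illustrative assumptions, not part of the original text.

```python
import math

H = 6.62607015e-34      # Planck constant, J s
M_E = 9.1093837015e-31  # electron rest mass, kg
C = 2.99792458e8        # speed of light, m / s


def compton_shift(theta):
    """Increase in photon wavelength in metres from (6.12); theta in radians."""
    return H / (M_E * C) * (1.0 - math.cos(theta))


if __name__ == "__main__":
    for degrees in (0, 45, 90, 180):
        shift = compton_shift(math.radians(degrees))
        print(f"theta = {degrees:3d} deg, shift = {shift:.4e} m")
```

The shift is independent of the incident wavelength and is of order $10^{-12}$ m, which is why
the effect is only measurable for X-rays and gamma-rays.
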


6.3 Bohr’s correspondence principle 


Einstein’s introduction of the concept of spontaneous and induced emission processes made
a profound impression upon Bohr who saw a way of relating the spontaneous transition
probabilities $A_m^n$ to classical electrodynamics through application of what became known as
the correspondence principle. Allusion has already been made in Sect. 6.1 to the result that
the Planck spectrum reduces to the classical Rayleigh–Jeans formula in the limit of very low
frequencies. In addition, Bohr had shown that the same equivalence applied to transitions
between states of very large quantum numbers $n$ for which the emitted frequencies are also
very low. Let us first demonstrate this result for the Bohr model of the hydrogen atom with
circular orbits.

We showed in Sect. 4.5 that the kinetic energy of an electron in a state with principal
quantum number $n$ is

$$T = \frac{1}{2} m_e v^2 = \frac{m_e e^4}{8\varepsilon_0^2 n^2 h^2}, \qquad (6.13)$$

from which it follows that the velocity of the electron in its circular orbit is given by

$$v^2 = \frac{e^4}{4\varepsilon_0^2 n^2 h^2}. \qquad (6.14)$$

The frequency of rotation of the electron about the nucleus $\nu_e$ is

$$\nu_e = \frac{v}{2\pi r} = \frac{v^2}{2\pi v r} = \frac{m_e v^2}{2\pi J} = \frac{T}{\pi J}, \qquad (6.15)$$

where $J = nh/2\pi$ is the quantised angular momentum of the electron. Therefore,

$$\nu_e = \frac{m_e e^4}{4\varepsilon_0^2 h^3 n^3}. \qquad (6.16)$$

Let us now work out the frequency of emission of radiative transitions between stationary
states with large values of $n$, the principal quantum number according to the Bohr picture.
From (4.28), we find

$$\nu = \frac{m_e e^4}{8\varepsilon_0^2 h^3} \left[\frac{1}{n^2} - \frac{1}{(n + \Delta n)^2}\right], \qquad (6.17)$$

where $\Delta n \ll n$ is the difference in principal quantum number between the upper and lower
states. We can therefore carry out a Taylor expansion of the term in $(n + \Delta n)^{-2}$ and find
that

$$\nu = \frac{m_e e^4}{4\varepsilon_0^2 h^3 n^3}\,\Delta n. \qquad (6.18)$$

Now, $\Delta n = 1, 2, \ldots$ and so we find that the frequency of rotation (6.16) is exactly equal to
the emission frequency of the transition for $\Delta n = 1$, $\nu = \nu_e$. Thus, in this simplest case, we
see that there is an exact correspondence between the frequency of rotation of the electron
in its orbit about the nucleus, which classically gives rise to electromagnetic radiation
at the orbital frequency of the electron, and the frequency associated with the difference
in energies of the quantised stationary states of the electron. This is the feature which
Bohr exploited in his correspondence principle. His proposal was to use the theory of the
electromagnetic radiation of the orbiting electron according to Maxwell’s electromagnetic
theory to determine the intensity and polarisation properties of the emission lines according
to quantum theory.
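
The approach of the quantum transition frequency (6.17) to the orbital frequency (6.16) can
be seen numerically. In the sketch below, frequencies are expressed in units of
$m_e e^4/\varepsilon_0^2 h^3$, so that only the dependence upon $n$ and $\Delta n$ remains; this choice of units
and the function names are assumptions made for illustration.

```python
def orbital_frequency(n):
    """Orbital frequency (6.16) in units of m_e e^4 / (eps_0^2 h^3)."""
    return 1.0 / (4.0 * n**3)


def transition_frequency(n, delta_n=1):
    """Exact Bohr transition frequency (6.17) in the same units."""
    return (1.0 / n**2 - 1.0 / (n + delta_n) ** 2) / 8.0


def frequency_ratio(n, delta_n=1):
    """Transition frequency divided by the orbital frequency of state n."""
    return transition_frequency(n, delta_n) / orbital_frequency(n)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, frequency_ratio(n, 1), frequency_ratio(n, 2))
```

For $n = 1$ the ratio is far from unity, but for large $n$ it tends to $\Delta n$: the line with
$\Delta n = 1$ approaches the orbital frequency and that with $\Delta n = 2$ approaches its second harmonic.
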
Notice another important feature of (6.18). The frequencies of lines with $\Delta n = 2, 3, 4, \ldots$
are exact harmonics of the orbital frequency $\nu_e$, in other words, $\nu = \tau\nu_e$, where
$\tau = \Delta n$ takes the values $\tau = 2, 3, 4, \ldots$. This will prove to be an important result in the
transition from classical to quantum mechanics (Sect. 10.3).

The above argument can be readily extended to more general orbits using the action–angle
variables introduced in Sects. 5.4 and 5.5. It is immediately apparent from (5.122)
that the quantum frequency $\nu$ for non-relativistic elliptical orbits for hydrogen-like atoms
follows exactly the same quantum rules as for circular orbits, where the principal quantum
number is $(n_r + n_\theta + n_\phi)$ in the notation of Sect. 5.5. In his major paper of 1918, Bohr
established this equivalence in the following way (Bohr, 1918a). According to the formalism
of action–angle variables, (5.121) states that the frequency associated with the $k$ coordinate
of a periodic, or more generally conditionally periodic, system is

$$\nu_k = \frac{\partial H}{\partial J_k} = \frac{\partial E}{\partial J_k}, \qquad (6.19)$$

where $J_k$ is the action variable associated with the $k$ coordinate and $H$ the Hamiltonian,
which is just the total energy $E$. This can be compared with the quantum expressions

$$h\nu = \Delta E, \qquad \Delta J_k = (n_k' - n_k'')h = \tau_k h, \qquad (6.20)$$

where $n_k'$ and $n_k''$ are the principal quantum numbers describing the states before and after
the transition and $\tau_k$ takes the values $1, 2, \ldots$. In the limit of large quantum numbers, in
which the energy differences between neighbouring states are small, we can replace the
differences in $E$ and $J_k$ between states by their partial differentials, that is,

$$\frac{\Delta E}{\Delta J_k} = \frac{\partial E}{\partial J_k}. \qquad (6.21)$$

This relation encapsulates Bohr’s correspondence principle, the left-hand side being quantum
and the right-hand side classical. Therefore,

$$h\nu_q = \Delta E = \frac{\Delta E}{\Delta J_k}\,\Delta J_k = \frac{\partial E}{\partial J_k}\,\Delta J_k = \tau_k \nu_k h. \qquad (6.22)$$

In other words, $\nu_q = \tau_k \nu_k$. For the case $\tau_k = 1$, we recover the exact equivalence between
the classical and the quantum results.

There is also an equivalence for higher harmonics of $\nu_k$ in the classical theory.
The values of $\tau_k > 1$ correspond to harmonics of the fundamental frequency $\nu_k$ and this
has a counterpart in the classical theory of the emission of electromagnetic radiation of an
electron in orbit about the nucleus at frequency $\nu$. The necessary tools have already been
described in Sect. 2.3.1. In particular, from (2.3), the average rate of radiation of a dipole
of dipole moment $p_0 = e x_0$ oscillating at angular frequency $\omega_0$ is

$$-\left(\frac{dE}{dt}\right)_{\text{average}} = \frac{\omega_0^4 e^2 x_0^2}{12\pi\varepsilon_0 c^3} = \frac{\omega_0^4 p_0^2}{12\pi\varepsilon_0 c^3}. \qquad (6.23)$$

Therefore, the procedure is to decompose the periodic motion of the electron in its orbit
into a complete set of orthogonal dipoles and this is achieved by describing the motion in
terms of Fourier series.

Consider first a periodic system of one degree of freedom. Then, the Bohr–Sommerfeld
quantisation condition for the system is

$$\oint p\,dq = nh. \qquad (6.24)$$

The displacement $x$ of the charged particle in its periodic, but not necessarily harmonic,
motion can be expressed as a Fourier series,

$$x = \sum_\tau C_\tau \cos 2\pi(\tau\omega t + c_\tau), \qquad (6.25)$$

where $C_\tau$ and $c_\tau$ are constants and the sum extends over all integral values of $\tau = 1, 2, \ldots$;
$\omega$ is the frequency associated with the periodic motion of the electron about the nucleus.³
In this one-dimensional example, the quantities $C_\tau$ are the amplitudes of oscillating dipoles
with frequencies $\omega, 2\omega, 3\omega, \ldots$ Thus, according to classical electrodynamics, the particle
emits a series of lines with frequencies $\omega, 2\omega, \ldots$ and the intensity of each line is given by
(6.23) with the dipole moment $p_0$ being given by $D_\tau = eC_\tau$. In other words, the intensities
of the lines are determined by the squares of the absolute values of the Fourier components
$|D_\tau|^2$. Bohr concluded


‘We must therefore expect that for large values of $n$, these coefficients will on the quantum
theory determine the probability of spontaneous transition from a given stationary state
for which $n = n'$ to a neighbouring state for which $n = n'' = n' - \tau$.’ (Bohr, 1918a)


Bohr had already noted that this result would be obtained in the case of an electron 
moving in an elliptical orbit about the nucleus: 


‘The possibility of an emission of a radiation of such a frequency may also be interpreted
from analogy with ordinary electrodynamics, as an electron rotating around a nucleus
in an elliptical orbit will emit a radiation which according to Fourier’s theorem can be
resolved into homogeneous components, the frequencies of which are $\tau\omega$, if $\omega$ is the
frequency of revolution of the electron.’ (Bohr, 1913b)


This was only the beginning of the story. In general, the orbits of the electrons will
not be simple ellipses because the electric field is not a purely $1/r$ potential when other
electrons are present and so the motion is conditionally periodic rather than periodic. In
two dimensions, the motion can be visualised as shown in Fig. 5.4. Therefore, we need the
general k-dimensional Fourier series for general conditionally periodic motion. In Bohr’s
notation, the coordinates of the conditionally periodic system are $q_1, q_2, \ldots, q_s$ and so the
displacement $\xi$ of the particles in any direction can be expressed as a function of time by
what Bohr calls an ‘$s$-double infinite Fourier series’ of the form

$$\xi = \sum C_{\tau_1,\ldots,\tau_s} \cos 2\pi\left[(\tau_1\omega_1 + \cdots + \tau_s\omega_s)t + c_{\tau_1,\ldots,\tau_s}\right], \qquad (6.26)$$

where the sum extends over all positive and negative values of $\tau_k$ and each $\omega_k$ is the
mean frequency of oscillation of each independent coordinate $q_k$. The values of $C_{\tau_1,\ldots,\tau_s}$
depend upon the constants of motion $\alpha_k$ derived from the Hamilton–Jacobi equations, such
as those described by (5.100)-(5.103). Notice that, unlike the degenerate elliptical orbits, in
general, $\omega_1 \neq \omega_2 \neq \cdots \neq \omega_s$, and so the $\omega_k$ are not simply related to each other by ratios of
integers. Notice also that the Fourier series contains not only harmonics of each fundamental
frequency $\omega_k$, but also ‘cross-terms’ $(\tau_i\omega_i + \tau_j\omega_j)$ which correspond to coupling between
independent modes of oscillation.

Bohr’s insight was that the spontaneous transition probability $A_m^n$ should be identified
with the dipole moments of the corresponding Fourier components of (6.26). Thus, if we
write the dipole moment associated with the transition $n \to m$ as $D_{n\to m} = eC_{n\to m}$, we
equate the classical radiation formula (6.23) to the spontaneous emission luminosity of a
single electron transition,

$$A_m^n h\nu = \frac{\omega^4 |D_{n\to m}|^2}{12\pi\varepsilon_0 c^3}. \qquad (6.27)$$

Therefore, since $\omega = 2\pi\nu$,

$$A_m^n = \frac{\omega^3}{6\varepsilon_0 c^3 h}\,|D_{n\to m}|^2. \qquad (6.28)$$

Strictly speaking, these correspondences should only apply for transitions $\Delta n$ between
stationary states with large values of $n$ such that $\Delta n \ll n$. Bohr, however, had little hesitation
in applying the correspondence principle to small values of $n$ as well. In addition to
interpreting the intensity and polarisation properties of the emission lines, the principle
also led to the concept of selection rules.


6.4 The first selection rules 


The expression (6.26) indicates that there are very large numbers of stationary states and 
correspondingly an even greater number of possible transitions, but not all of these are 
realised in nature. In his paper of 1918, Bohr enunciated the first selection rules which 
would limit the number of permitted transitions between stationary states. Clearly, if the
Fourier component $D_{n\to m}$ in (6.28) is zero, the probability of the transition is zero and the
transition is forbidden for such electric dipole transitions.

Let us consider first the case of a one-dimensional harmonic oscillator. Because the
system is periodic, we can simplify the Fourier series (6.26) by writing

$$p = \frac{1}{2} \sum_{\tau=-\infty}^{\infty} D_\tau \exp(i\tau\omega_0 t), \qquad (6.29)$$

where the factor 1/2 is included since the summation spans the range $-\infty < \tau < +\infty$.
The time variation of the dipole moment of the oscillator is harmonic and so

$$p = e x_0 \cos\omega_0 t = \frac{e x_0}{2}\left(e^{i\omega_0 t} + e^{-i\omega_0 t}\right). \qquad (6.30)$$

To find the Fourier components we multiply both sides by $\exp(-i\tau'\omega_0 t)$ and integrate over
one cycle of the oscillation. The only Fourier components which are non-zero are those
corresponding to $\tau = 1$ and $\tau = -1$,

$$D_\tau = e x_0 \ \text{ if } \ \tau = \pm 1, \quad\text{and}\quad D_\tau = 0 \ \text{ if } \ \tau \neq \pm 1. \qquad (6.31)$$

Therefore, the expression for the spontaneous transition probability is

$$A_{n\to n-1} = \frac{\omega_0^3}{6\varepsilon_0 c^3 h}\,e^2 x_0^2, \qquad (6.32)$$

with the selection rule $\Delta n = \pm 1$. Since the quantised energy levels of the harmonic
oscillator are

$$E_n = nh\nu_0 = \frac{1}{2} m_e \omega_0^2 x_0^2, \qquad (6.33)$$

using the relation $\omega_0 = 2\pi\nu_0$, the transition probability can also be written

$$A_{n\to n-1} = \frac{e^2 \omega_0^2}{6\pi\varepsilon_0 m_e c^3}\,n. \qquad (6.34)$$

These are rather remarkable properties of the quantised harmonic oscillator. It can be
seen that, because of the equal spacings between the energy levels and the selection rule
$\Delta n = \pm 1$, the frequency of the emitted radiation is exactly equal to that of
the oscillator for all permitted transitions. Notice also that although Bohr’s correspondence
principle was intended only to provide an estimate of the radiation rate for high energy
levels, the results hold true for transitions between all energy levels of the harmonic
oscillator. As remarked by Jammer (1989), these remarkable features of the quantised
harmonic oscillator are fortunate features of the early conceptual development of quantum
theory. In his pioneering investigations of 1900, Planck had assumed that the quantisation of
the harmonic oscillator resulted in equal energy differences between the energy levels and
that these differences corresponded to the frequency of the oscillator through the relations
$\Delta E = \varepsilon = h\nu$ and $\nu = \nu_0$. Notice also that this equally spaced quantum ‘ladder’ is found
because the oscillation is harmonic and so derived from a harmonic potential.
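
The statement that only the $\tau = \pm 1$ Fourier components of the harmonic oscillator are
non-zero can be illustrated numerically. The sketch below computes the components $D_\tau$
under the convention of (6.29) by summing over one cycle of the motion, with $e x_0$ set to
unity. The function names, the number of samples and the anharmonic comparison case are
illustrative assumptions and are not taken from the text.

```python
import cmath
import math


def fourier_components(dipole, taus, n_samples=256):
    """Return {tau: D_tau} for p = (1/2) * sum_tau D_tau * exp(i tau phase).

    `dipole` gives the dipole moment as a function of the phase omega_0 * t.
    The integral over one cycle is replaced by a sum over equally spaced
    samples, which is exact here for components with |tau| < n_samples / 2.
    """
    components = {}
    for tau in taus:
        total = 0j
        for j in range(n_samples):
            phase = 2.0 * math.pi * j / n_samples
            total += dipole(phase) * cmath.exp(-1j * tau * phase)
        components[tau] = 2.0 * total / n_samples
    return components


def anharmonic_dipole(phase):
    """Hypothetical non-harmonic motion with a small second harmonic."""
    return math.cos(phase) + 0.3 * math.cos(2.0 * phase)


# Harmonic oscillator (6.30) with e * x_0 = 1.
harmonic = fourier_components(math.cos, range(-3, 4))
# Comparison case: the second harmonic opens the tau = +/-2 transitions.
anharmonic = fourier_components(anharmonic_dipole, range(-3, 4))

if __name__ == "__main__":
    for tau in range(-3, 4):
        print(tau, round(abs(harmonic[tau]), 6), round(abs(anharmonic[tau]), 6))
```

For the harmonic case only $D_{\pm 1}$ survive, in agreement with (6.31), whereas any departure
from harmonic motion introduces components with $|\tau| > 1$ and hence further permitted
transitions under the correspondence principle.
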

The same procedure was adopted by Bohr for transitions between the stationary states of
atoms. The transition probabilities are associated with the amplitudes of the components
of the Fourier series, but now the transitions in each of the $r$, $\theta$ and $\phi$ coordinates had to be
considered. Let us consider the simplest case of hydrogen-like atoms in which the orbits of
the electrons are ellipses, the periods of electrons with the same principal quantum number
$n$ being fixed. The solution of this problem was discussed in Sect. 5.5, the energy levels
being given by (5.120) and the frequencies associated with the three orthogonal coordinates
$r$, $\theta$ and $\phi$ given by (5.122) being all the same, namely,

$$E = -\frac{Z^2 e^4 m_e}{8\varepsilon_0^2 h^2 (n_r + n_\theta + n_\phi)^2}, \qquad \nu_r = \nu_\theta = \nu_\phi = \frac{4\pi^2 m_e \kappa^2}{(J_r + J_\theta + J_\phi)^3}. \qquad (6.35)$$

Bohr pointed out in his paper of 1918 that in this case, the quantisation rules derived for
the harmonic oscillator also apply since there is only a single frequency associated with
the transition, just as in (6.30). Therefore similar selection rules apply, but now all three
quantum numbers need to be considered.
The simplest approach is that presented by ter Haar (1967). The dipole moment can be
resolved into components along the x-, y- and z-directions as follows:

$$P_x = e r \cos\phi \sin\theta, \qquad (6.36)$$
$$P_y = e r \sin\phi \sin\theta, \qquad (6.37)$$
$$P_z = e r \cos\theta, \qquad (6.38)$$

where $(r, \theta, \phi)$ are standard spherical polar coordinates. The terms in $\theta$ and $\phi$ can be
written in the form $e^{\pm i\theta}$ and $e^{\pm i\phi}$. For the case of hydrogen-like atoms, these factors as a
function of time $t$ can be written in the form

$$e^{\pm i\theta(t)} = \sum_\tau A_\tau\, e^{\pm i\tau\omega_\theta t} \propto e^{\pm i\omega_\theta t}, \qquad (6.39)$$

$$e^{\pm i\phi(t)} = \sum_\tau B_\tau\, e^{\pm i\tau\omega_\phi t} \propto e^{\pm i\omega_\phi t}, \qquad (6.40)$$

where $A_\tau$ and $B_\tau$ can be written in the same form of Fourier series as (6.29) and
$\omega_\theta = 2\pi\nu_\theta = \omega_\phi = 2\pi\nu_\phi = 2\pi\nu_r$. Just as in the case of the harmonic oscillator, the only terms
of the Fourier expansions which are non-zero are those with $\tau = \pm 1$ and so the selection
rule for the quantum number in the $\theta$-coordinate is

$$\Delta n_\theta = \pm 1. \qquad (6.41)$$

A similar result applies for the $\phi$-component, but in addition, since there is no dependence
of $P_z$ on $\phi$, $\nu_\phi = 0$ and so the selection rule $\Delta n_\phi = 0$ is also allowed. Therefore, for the
$\phi$-component, the quantisation conditions are:

$$\Delta n_\phi = 0, \pm 1. \qquad (6.42)$$

Bohr went on to show that these quantisation rules also apply to the case in which the
frequencies $\nu_r$, $\nu_\theta$ and $\nu_\phi$ are different, the case of conditionally periodic orbits. Thus,
by using the correspondence principle, Bohr was able to reduce the number of possible
transitions between stationary states to those which satisfied the above selection rules for
electric dipole transitions.


6.5 The polarisation of quantised radiation and selection rules 


Further insight into the origin of the selection rules came from considerations of the
polarisation properties of the radiation emitted in quantum transitions. The basic premise
of the Bohr–Sommerfeld model of atoms is that the angular momenta of stationary states
are quantised in units of $h/2\pi$. In the simplest model with circular orbits, when a transition
occurs, there is a change in the energy of the electron in the atom $h\nu$ and a corresponding
change in angular momentum $h/2\pi$. The angular momentum is transferred to the emitted
radiation. Bohr and Sommerfeld appreciated that since the radiation is quantised, there must
be selection rules which determine which transitions between stationary states can result in
electric dipole radiation.

The application of the correspondence principle to the conservation of angular momentum
in the emission of radiation was worked out by Sommerfeld’s assistant Adalbert
Rubinowicz and published in an important paper of 1918 (Rubinowicz, 1918a). Sommerfeld’s
exposition in Chapter 5 of his book Atomic Structure and Spectral Lines (1919)
provides a wonderful review of the problems of reconciling the classical and quantum
pictures of the process of emission of radiation by atoms which confronted the pioneers of
quantum theory. Rubinowicz’s approach to the emission of an electron in a general elliptical
orbit about the nucleus was adopted. They considered the radiation of a three-dimensional
dipole $\mathbf{P}$ which has components

$$\mathbf{P} = \mathbf{p} \exp(i\omega t), \qquad \mathbf{p} = \mathbf{p}_x + \mathbf{p}_y + \mathbf{p}_z,$$
$$p_x = a \exp(i\alpha), \qquad (6.43)$$
$$p_y = b \exp(i\beta), \qquad (6.44)$$
$$p_z = c \exp(i\gamma). \qquad (6.45)$$

The amplitudes of the dipole in the x-, y- and z-directions are $a$, $b$ and $c$ and the phases of
the oscillators are determined by the quantities $\alpha$, $\beta$ and $\gamma$. If $\alpha = \beta = \gamma = 0$, the dipole
is a linear oscillator with axis in the direction of $\mathbf{P}$. A dipole rotating about the z-axis in
the x–y plane could be described by $a = b$, $c = 0$, $\alpha = 0$, $\beta = \pm\pi/2$.

The properties of the radiation of an oscillating dipole were described in Sect. 2.3.1.
More specifically, in the far field limit, the instantaneous electric field of the radiation is
given by

$$E_\theta = \frac{|\ddot{\mathbf{p}}| \sin\theta}{4\pi\varepsilon_0 c^2 r}. \qquad (6.46)$$

The rate of energy flow per unit area per second at distance $r$ is given by the magnitude of
the Poynting vector $\mathbf{S} = \mathbf{E} \times \mathbf{H} = (E_\theta^2/Z_0)\,\mathbf{i}_r$, where $Z_0 = (\mu_0/\varepsilon_0)^{1/2}$ is the impedance of
free space. The rate of energy flow through the area $r^2\,d\Omega$ subtended by solid angle $d\Omega$ at
angle $\theta$ and at distance $r$ from the charge is therefore

$$S r^2\,d\Omega = -\left(\frac{dE}{dt}\right) d\Omega = \frac{|\ddot{\mathbf{p}}|^2 \sin^2\theta}{16\pi^2 Z_0 \varepsilon_0^2 c^4 r^2}\,r^2\,d\Omega = \frac{|\ddot{\mathbf{p}}|^2 \sin^2\theta}{16\pi^2 \varepsilon_0 c^3}\,d\Omega. \qquad (6.47)$$

To find the total radiation rate $-dE/dt$, we integrate over solid angle. Because of the
symmetry of the emitted intensity with respect to the acceleration vector, we can integrate
over the solid angle defined by the circular strip between the angles $\theta$ and $\theta + d\theta$,
$d\Omega = 2\pi \sin\theta\,d\theta$:

$$-\left(\frac{dE}{dt}\right) = \int_0^\pi \frac{|\ddot{\mathbf{p}}|^2 \sin^2\theta}{16\pi^2 \varepsilon_0 c^3}\,2\pi \sin\theta\,d\theta. \qquad (6.48)$$

We find the result

$$-\left(\frac{dE}{dt}\right) = \frac{|\ddot{\mathbf{p}}|^2}{6\pi\varepsilon_0 c^3} = \frac{e^2 |\ddot{\mathbf{r}}|^2}{6\pi\varepsilon_0 c^3}. \qquad (6.49)$$

This result is sometimes referred to as Larmor’s formula, the result given in (2.1).
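
The angular integral in (6.48) can be checked numerically. The sketch below uses a simple
midpoint rule to evaluate the purely geometrical factor that multiplies $|\ddot{\mathbf{p}}|^2/\varepsilon_0 c^3$ and
compares it with the value $1/6\pi$ appearing in (6.49). The function name and the number of
integration steps are illustrative assumptions.

```python
import math


def larmor_angular_factor(n_steps=10000):
    """Midpoint-rule estimate of the integral over theta from 0 to pi of
    sin^2(theta) * 2 pi sin(theta) / (16 pi^2), the angular factor in (6.48)."""
    step = math.pi / n_steps
    total = 0.0
    for i in range(n_steps):
        theta = (i + 0.5) * step
        total += math.sin(theta) ** 3
    return total * step * 2.0 * math.pi / (16.0 * math.pi**2)


if __name__ == "__main__":
    print(larmor_angular_factor(), 1.0 / (6.0 * math.pi))
```

The two numbers agree to within the accuracy of the quadrature, reflecting the fact that the
integral of $\sin^3\theta$ from 0 to $\pi$ is 4/3.
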
These results illustrate the well-known features of the radiation of an accelerated charge 
that the radiation is emitted symmetrically with respect to the acceleration vector and the 
energy loss rate is given by the Poynting vector flux which is radial. This energy also 
transfers the momentum of the waves to infinity, the momentum flow rate per unit area per 
second being S/c. However, because of the symmetry of the emitted radiation, there is no 
overall momentum loss by the particle. These results remain true for a linear oscillator and 
for the rotating dipole. 

There is, however, a difference in the case of the angular momentum, or the moment of
the momentum, of the radiation field. The momentum per unit volume at distance $r$ from
the dipole is $\mathbf{S}/c^2$ and so the moment of the momentum of the emitted radiation is

$$\mathbf{M} = \mathbf{r} \times \left(\frac{\mathbf{S}}{c^2}\right). \qquad (6.50)$$


This quantity has to be integrated over all space. This calculation is carried out by Som- 
merfeld in Appendix 9 of his book. In the case in which the dipole rotates about the z-axis 
in the x—y plane, the angular momentum of the radiation transferred per unit time is 


_ W 2ab siny 


ne. (6.51) 


where W is the total energy radiated by the orbiting electron and $\gamma = \beta - \alpha$ is the phase 
difference between the oscillations in the x- and y-directions. Let us check that this result 
makes sense. In the cases in which only one component of the oscillator is present, say 
b = 0, there is evidently no net angular momentum in the radiation, as would be expected 
from the symmetry of the radiated emission. Likewise, if $\gamma = 0$ so that the oscillations in 
the x- and y-directions are in phase, there is no angular momentum since the particle is a 
linear oscillator at some fixed angle in the x-y plane. If, however, the phase difference is 
$\pm\pi/2$ and a = b, the quantity $2ab\sin\gamma/(a^2+b^2)$ takes maximum and minimum values 
of $\pm 1$. These values correspond to circular polarisation in opposite senses when viewed 
along the z-direction. Notice that these results are no more than the law of conservation 
of angular momentum for the emission of radiation. An electron in a circular orbit, for 
example, loses angular momentum as it radiates away energy and this is transferred to the 
radiation field which takes away the angular momentum as circularly polarised radiation. 
Sommerfeld now translates these classical results to the quantum emission of radiation. 
The energy removed in the quantum emission of radiation is $W = h\nu$ and so, since $\omega = 2\pi\nu$, 








$$N = \frac{h}{2\pi}\,\frac{2ab\sin\gamma}{a^2+b^2}. \tag{6.52}$$


Now, according to the rules for the quantisation of angular momentum $J_\phi = n_\phi h/2\pi$ and 
so, when a quantum transition $\Delta n_\phi$ is made, 

$$\Delta J = \frac{h}{2\pi}\,\Delta n_\phi. \tag{6.53}$$

Equating (6.52) and (6.53), we find 

$$\Delta n_\phi = \frac{2ab\sin\gamma}{a^2+b^2}. \tag{6.54}$$

Now, we have just shown that the extremal values of the right-hand side of (6.54) are $\pm 1$ 
and so the only possible integral values of $\Delta n_\phi$ are 

$$\Delta n_\phi = +1, \quad \Delta n_\phi = -1, \quad \Delta n_\phi = 0. \tag{6.55}$$

Sommerfeld’s analysis resulted in two key results for the quantum emission of atoms: 


• The rule of selection states: the azimuthal quantum number can at most alter by one 
unit at a time in changes of configuration of the atom. 

• The rule of polarisation demands that if the azimuthal quantum number alters by ±1, 
the light is circularly polarised; if the quantum number remains constant, the light is 
linearly polarised. 
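
The argument leading to the selection rule can be checked numerically. The following is a minimal sketch, not part of Sommerfeld's analysis: it scans the factor $2ab\sin\gamma/(a^2+b^2)$ over a few amplitudes and phases and collects the integer values it can take. The function names and the sampled amplitudes are illustrative assumptions.

```python
import math


def polarisation_factor(a: float, b: float, gamma: float) -> float:
    """Return 2ab sin(gamma) / (a^2 + b^2) for amplitudes a, b and phase difference gamma."""
    if a == 0.0 and b == 0.0:
        raise ValueError("at least one amplitude must be non-zero")
    return 2.0 * a * b * math.sin(gamma) / (a * a + b * b)


def allowed_changes(samples: int = 181) -> set:
    """Scan sample amplitudes and phases and collect the integer values reached."""
    found = set()
    amplitudes = [0.0, 0.5, 1.0, 2.0]  # illustrative values only
    for a in amplitudes:
        for b in amplitudes:
            if a == 0.0 and b == 0.0:
                continue
            for i in range(samples):
                gamma = -math.pi + 2.0 * math.pi * i / (samples - 1)
                f = polarisation_factor(a, b, gamma)
                # The factor can never leave the interval [-1, 1].
                assert -1.0 - 1e-12 <= f <= 1.0 + 1e-12
                if abs(f - round(f)) < 1e-9:
                    found.add(int(round(f)))
    return found


print(sorted(allowed_changes()))
```

Under these assumptions the only integers reached are -1, 0 and +1, the extreme values occurring for a = b with a phase difference of a quarter of a period, which is the case of circular polarisation.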


As we will discuss in the next chapter, Rubinowicz was to apply these concepts to the 
understanding of the Stark and Zeeman effects in hydrogen and other atoms with some 
success (Rubinowicz, 1918b). 

Thus, Bohr’s and Rubinowicz’s somewhat different approaches led to the same 
conclusion, that there are necessarily selection rules which determine which transitions in atoms 
are to be permitted. Note that there is an important difference between their approaches. 
Bohr worked exclusively in terms of the correspondence principle in which the transitions 
between states were subject to the quantum rules and the selection rules were found by 
analogy. In his picture, the radiation of the atoms themselves was determined by Maxwell’s 
equations. In contrast, Rubinowicz considered both the atoms and the radiation to be part 
of a single quantum system, as he described in his paper of 1921 (Rubinowicz, 1921). His 
analysis endowed the emitted photons with energy and angular momentum. 


6.6 The Rydberg series and the quantum defect 


The Bohr model of the atom could successfully account in considerable detail for the 
spectral properties of the hydrogen atom, but atoms with more than one electron were a 
much greater challenge. Nonetheless, the fact that the Rydberg series of atoms such as 
sodium, potassium, magnesium, calcium and zinc had similar forms to that of the hydrogen 
atom, as illustrated by the Rydberg formulae (1.19)-(1.21), suggested that the same general 
principles were involved. It is striking, for example, that the spectrum of sodium is similar 
to that of hydrogen. What was needed was a suitable form for the electrostatic potential 
experienced by an electron in the combined field of the nucleus and the other orbiting 
electrons. 

Sommerfeld tackled this problem in his book Atomic Structure and Spectral Lines (1919) 
by adopting a more general form for the electrostatic potential experienced by the electron. 
He wrote 


$$U = -\frac{eE}{4\pi\epsilon_0 r} + V, \quad\text{where}\quad V = \frac{e^2}{4\pi\epsilon_0 r}\left[c_1\left(\frac{a}{r}\right)^2 + c_2\left(\frac{a}{r}\right)^3 + \dots\right], \tag{6.56}$$

where the scale a is chosen to be the radius of the first circular Bohr orbit, $a = h^2\epsilon_0/\pi m_e e^2$. 


Assuming the electron moves in a planar orbit defined by the coordinates $r$, $\phi$ in this central 
electrostatic potential, the analysis is exactly the same as in Sect. 5.5, but now the potential 
term k/r in (5.104) and similar formulae is replaced by (6.56). Thus, the quantisation 
condition for $\phi$ takes the usual form $p_\phi = n_\phi h/2\pi$. In the radial coordinate, Sommerfeld 
showed that, including only the first additional term in $c_1$ and carrying out the same type 
of integration which resulted in (5.116), the expression for the energies of the stationary 
states of the electron is of the form 


$$E = -\frac{m_e Z^2 e^4}{8\epsilon_0^2 h^2}\,\frac{1}{(n_r + n_\phi + d)^2}, \tag{6.57}$$

where $d = Zc_1/n_\phi^3$. Thus, the additional term in d depends upon the nature of the additional 
term in the potential. Similar results were obtained when the next term in c2 was included 
in the expression for the electrostatic potential. Evidently, the inclusion of these additional 
terms leads to energy levels of the type needed to account for the Rydberg formulae 
(1.19)-(1.21). 

The origin of this modification to the standard Bohr formula can be understood from 
the argument presented by ter Haar (1967). Considering for illustration the example of 
the sodium atom, the Bohr-Sommerfeld picture would involve a nucleus with charge 11e, 
surrounded by 11 orbiting electrons. If a single electron is responsible for the spectral line, 
we can consider its motion in the combined field of the nucleus and the other 10 electrons. 
The potential due to the 10 other electrons can be represented by a spherically symmetric 
central field of force, an assumption which was to find some justification in the shell model 
of the electron distribution in atoms. Therefore, when the electron is far from the nucleus, 
it experiences a potential $-e^2/4\pi\epsilon_0 r$, since the other electrons shield all but 1/11 of the 
nuclear charge. On the other hand, when close to the nucleus, the electron experiences the 
full nuclear electrostatic potential, $-11e^2/4\pi\epsilon_0 r$. 

The specific form of the potential is not known, but we can derive the general features of 
the solution from the following line of reasoning. To find the energy of the stationary states, 
we need to find the integral equivalent to (5.110) which was evaluated for a pure inverse- 
square law electrostatic potential. Replacing the term 2mk/r by the general potential term 
2mU(r), the action integral becomes 

$$J_r = \oint \sqrt{2mE + 2mU(r) - \frac{\alpha_2^2}{r^2}}\;dr, \tag{6.58}$$


since, according to classical mechanics, the radial component of the momentum of the 
electron is given by the expression 


$$p_r^2 = 2mU(r) - \frac{\alpha_2^2}{r^2} + 2mE. \tag{6.59}$$


For the case of a pure inverse-square law of electrostatic attraction, U(r) = k/r and then 
the motion of the electron can be represented by a diagram in which the square of the radial 
momentum is plotted against distance from the nucleus (Fig. 6.1a). The total energy of the 
bound orbit is E and is a negative quantity. At the points $r_{\min}$ and $r_{\max}$, the radial component 










of the momentum goes to zero since all the momentum is then in the azimuthal component 
and the integral (6.58) is taken between these limits. Correspondingly, the radial motion of 
the electron can be represented on a $p_r$-$r$ diagram which is shown in Fig. 6.1b. 

Fig. 6.1 (a) Illustrating the motion of an electron in a $p_r^2$-$r$ diagram for an inverse-square law force of electrostatic attraction. 
(b) The same motion plotted in a $p_r$-$r$ diagram (ter Haar, 1967). 
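
The statement that the radial action is the area enclosed in the $p_r$-$r$ plane can be illustrated numerically for the pure inverse-square case. The sketch below is an illustration under stated assumptions rather than the procedure of the text: it works in arbitrary units with m = k = 1, uses the substitution $r = c - d\cos\theta$ between the turning points, and compares the result with the standard closed form $2\pi\,(k\sqrt{m/(-2E)} - L)$ for the Kepler problem. All function and parameter names are hypothetical.

```python
import math


def radial_action(energy, ang_mom, m=1.0, k=1.0, steps=2000):
    """Closed-path integral of p_r dr for U(r) = k/r, with
    p_r^2 = 2 m E + 2 m k / r - L^2 / r^2 and E < 0 (bound orbit)."""
    if energy >= 0.0:
        raise ValueError("bound orbits need negative total energy")
    # Turning points are the roots of 2 m E r^2 + 2 m k r - L^2 = 0.
    disc = (m * k) ** 2 + 2.0 * m * energy * ang_mom ** 2
    if disc < 0.0:
        raise ValueError("no bound orbit for these values")
    centre = -k / (2.0 * energy)                        # (r_min + r_max) / 2
    half_width = math.sqrt(disc) / (-2.0 * m * energy)  # (r_max - r_min) / 2
    prefactor = math.sqrt(-2.0 * m * energy)
    d_theta = math.pi / steps
    total = 0.0
    for i in range(steps):
        # Midpoint rule in theta, where r = centre - half_width * cos(theta).
        theta = (i + 0.5) * d_theta
        r = centre - half_width * math.cos(theta)
        total += prefactor * (half_width * math.sin(theta)) ** 2 / r * d_theta
    return 2.0 * total  # outward and return legs of the radial motion


def radial_action_exact(energy, ang_mom, m=1.0, k=1.0):
    """Closed-form radial action of the Kepler problem, for comparison."""
    return 2.0 * math.pi * (k * math.sqrt(m / (-2.0 * energy)) - ang_mom)


print(radial_action(-0.125, 1.0), radial_action_exact(-0.125, 1.0))
```

In these units the enclosed area depends only on E and on the angular momentum, so fixing the area in units of h fixes the allowed energies.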

Now, the quantisation condition is 

$$J_r = \oint p_r\,dr = n_r h, \tag{6.60}$$

and so this integral corresponds to the area bounded by the locus in the $p_r$-$r$ relation shown 
in Fig. 6.1b. In other words, whatever the shape of the locus in the $p_r$-$r$ plane, the area is 
quantised in units of h. Let us now consider the case of the sodium atom. At large radii, 
the locus will have the same form as the hydrogen atom since it only feels a net single 
electronic charge. Close to the nucleus, the potential will be much deeper since the charge 
of the nucleus is 11e. Therefore, we would expect the $p_r^2$-$r$ relation to extend much closer 
to the nucleus, as illustrated in the comparison of the $U(r) \propto 1/r$ potential and the potential 
felt by the electron in the sodium atom in Fig. 6.2(i). The corresponding quantities in the 
$p_r$-$r$ plane are shown in Fig. 6.2(ii). Now, the quantisation condition for $J_r$ for the sodium 
atom is that the area enclosed by the locus (b) in Fig. 6.2(ii) is equal to $n_r h$. This area is 
equal to the area enclosed by the locus for the hydrogen atom plus the shaded area, which 
we write as some fraction $\alpha$ of the unit of area h. Therefore, we can write the quantisation condition 








$$n_r h = \oint \sqrt{2mE + \frac{2mk}{r} - \frac{\alpha_2^2}{r^2}}\;dr + \alpha h, \tag{6.61}$$

and so 

$$(n_r - \alpha)h = \oint \sqrt{2mE + \frac{2mk}{r} - \frac{\alpha_2^2}{r^2}}\;dr. \tag{6.62}$$

If we now carry out exactly the same analysis which led to (5.120), we find 

$$E = -\frac{m e^4}{8\epsilon_0^2 h^2}\,\frac{1}{(n-\alpha)^2}, \tag{6.63}$$

where $n = n_r + n_\theta + n_\phi$ is the principal quantum number. This is the Rydberg formula for 
the energy levels of atoms quoted in (1.16). 

Fig. 6.2 (i) Illustrating the motion of an electron in a $p_r^2$-$r$ diagram for the potential of a hydrogen-like atom (a) and a sodium 
atom (b). (ii) The same motions plotted in a $p_r$-$r$ diagram (ter Haar, 1967). 
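
A short numerical sketch may help to show how a quantum defect shifts a level relative to the hydrogen value. It assumes the Rydberg form $E = -R/(n-\alpha)^2$ with R taken as 13.605693 eV. The value $\alpha = 1.37$ used in the example is an illustrative number of roughly the size usually quoted for the lowest s-level of sodium; it is not taken from the text, and the function name is hypothetical.

```python
RYDBERG_EV = 13.605693  # binding energy of the hydrogen ground state in eV


def level_energy_ev(n: int, alpha: float = 0.0) -> float:
    """Energy in eV of a level with principal quantum number n and quantum defect alpha."""
    effective_n = n - alpha
    if effective_n <= 0.0:
        raise ValueError("the effective quantum number n - alpha must be positive")
    return -RYDBERG_EV / effective_n ** 2


# Hydrogen-like level with n = 3, and the same level with an assumed defect.
print(level_energy_ev(3))
print(level_energy_ev(3, alpha=1.37))
```

With a positive defect the level lies well below the corresponding hydrogen level, as expected for an orbit which penetrates the screening electrons and feels more of the nuclear charge.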


6.7 Towards a more complete quantum theory of atoms 


Over a matter of a few years, Bohr’s theory of the hydrogen atom had been given a much 
more secure theoretical basis. The models of atoms were now fully three dimensional and 
the correspondence principle offered a route to understanding the intensity and polarisation 
properties of the emission of atoms. The selection rules limited the numbers of allowed 
transitions between stationary states. Despite the continuing concern about the incompat- 
ibilities between the quantum and classical pictures, there was optimism that these ideas 
provided the basis for the development of a more complete quantum theory of atoms. 

The subtle use of the correspondence principle was behind many of the successes of the 
theory. Sommerfeld remarked that the correspondence principle was like, 


‘a magic wand that allowed the results of classical wave theory to be of use for the quantum 
theory.’ 

To quote Jammer (1989), 


‘The correspondence principle turned out to be a most versatile and productive conceptual 
device for the further development of the older quantum theory — and ...even for the 
establishment of modern quantum mechanics. ... In fact, there was rarely in the history 
of physics a comprehensive theory which owed so much to one principle as quantum 
mechanics owed to Bohr’s correspondence principle.’ 


As we will discuss in Chap. 8, Bohr’s imaginative use of the correspondence principle led 
to his theory of atomic structure and the origin of the periodic table. Bohr and Einstein held 
quite different views about how the problems of atomic structure and quantum phenomena 
should be addressed. By 1920, Bohr’s view was that the propagation of light is accurately 
described by Maxwell’s equations for the electromagnetic field, while the processes of the 
emission and absorption of radiation are quantum phenomena bolted onto a mechanical 
model of the atom. Einstein was critical of the insecure foundations of the old quantum 
theory, taking the view that the emission, absorption and propagation of radiation are all 
quantum processes. Despite these differing points of view, Einstein’s admiration for Bohr’s 
insight and achievements was unstinting. Many years later, Einstein wrote: 


‘That this insecure and contradictory foundation [of the quantum theory] was sufficient 
to enable a man of Bohr’s instinct and tact to discover the major laws of the spectral lines 
and of the electron shells of atoms together with their significance for chemistry appeared 
to me like a miracle — and it appears to me as a miracle even today. This is the highest 
form of musicality in the sphere of thought.’ (Einstein, 1949) 


7.1 Optical spectroscopy, multiplets and the splitting of spectral lines 



The achievements described in Chap. 6 represented a remarkable advance in the understand- 
ing of quantum phenomena, but there remained major challenges which were ultimately to 
undermine the successes of the old quantum theory. Continuing advances in spectroscopy 
enabled high resolution spectra to be obtained and, with the ability to place the sources of 
emission in strong electric and magnetic fields, the full complexity of atomic and molec- 
ular spectra became apparent. Atomic spectra display regularities, for example, the series 
spectra of elements such as sodium and calcium which could be described by the Rydberg 
formula (Sect. 1.6). Some of the most prominent spectral features consisted, however, of 
multiplets, meaning the splitting of a line into a number of separate lines with similar 
wavelengths. Examples of multiplets are illustrated in Fig. 7.1, derived from observations 
of the photosphere of the Sun from the Pic du Midi observatory. 

The simplest lines are singlets, the example of the H$\alpha$ line of the Balmer series of 
hydrogen being shown in Fig. 7.1a. In fact, the line is a very narrow doublet, which 
Sommerfeld attributed to the effects of special relativity upon the circular and elliptical 
orbits of electrons of the same principal quantum number (Sect. 5.3). The splittings we are 
interested in here are very much larger effects. The classic example of a doublet is the 
splitting of the sodium D line into two bright components labelled $D_1$ (589.592 nm) and 
$D_2$ (588.995 nm) (Fig. 7.1b). A number of lines are triplets, for example, the magnesium 
triplet at 516.7 nm, 517.3 nm and 518.4 nm shown in Fig. 7.1c. 

In addition to multiplets, individual lines are split into a number of components in the 
presence of electric and magnetic fields, the Stark and Zeeman effects respectively. The 
discovery and interpretation of the Zeeman effect has already been discussed in Sect. 4.1. 
The simplest variant is the normal Zeeman effect, illustrated schematically in Fig. 7.2a. 
Generally, however, the splittings are more complex. An important example is the splitting 
of the sodium $D_1$ and $D_2$ lines, in which the longer wavelength $D_1$ line is split into four 
components with no central line, while the shorter wavelength $D_2$ line is split into six 
components, again with no central line (Fig. 7.2b). The Zeeman splitting of the triplet zinc 
line is illustrated in Fig. 7.2c, showing that one of the lines displays the normal Zeeman 
effect while the other two line splittings are similar to those of the sodium D lines. It was 
to prove to be a major challenge to understand the origins of the splitting of spectral lines 
by electric and magnetic fields. 

Fig. 7.1 Illustrating the multiplets of lines observed in optical spectra of the chromosphere of the Sun. (a) The singlet line of 
H$\alpha$. (b) The sodium $D_1$ (589.592 nm) and $D_2$ (588.995 nm) lines. The spectrum also includes the neutral helium line at 
587.6 nm. (c) The magnesium triplet at 516.7 nm, 517.3 nm and 518.4 nm. (Courtesy of J-P. Rozelot, V. Desnoux and 
C. Buil. These observations were made at Pic Du Midi ‘Lunette Jean Rosch’ with the eShel spectrograph.) 

Fig. 7.2 Schematic diagrams illustrating the normal and anomalous Zeeman effects. (a) The Zeeman splitting of a singlet line. 
(b) The Zeeman splitting of the doublet D lines of sodium. (c) The Zeeman splitting of the triplet lines of zinc. 


7.2 The Stark effect 





Following the discovery of the splitting of emission lines in magnetic fields by Zeeman in 
1896, Woldemar Voigt predicted in 1901 that there should be an analogous effect if atoms 
are subject to an electric field (Voigt, 1901). He estimated theoretically the magnitude of 
this effect for an elastically bound electron in an atom, but found the predicted splitting 
to be too small to be measured experimentally. Despite this discouraging prediction, Stark 
investigated the effect experimentally in 1913 using the emission of ‘canal rays’, the ions 
which passed through the perforated cathode of a discharge tube in the opposite direction 
to the electron beam (Fig. 7.3a) (Stark, 1913); this apparatus was the precursor of the 
mass spectrograph which was to be perfected by Francis Aston. In his laboratory at the 
Technische Hochschule in Aachen, Stark discovered the splitting of the Balmer series of 
hydrogen and the lines of helium into a number of components when the canal rays were 
subjected to a strong electric field. Almost contemporaneously, Antonino Lo Surdo at 
Florence had detected a similar effect in his discharge tubes (Lo Surdo, 1913). Stark found 
that, when viewed perpendicular to the direction of the electric field, the H$\alpha$ and H$\beta$ lines 
were split into five components, the central components being polarised perpendicular to 
the direction of the field and the outer components parallel to the field. When 
viewed along the electric field direction, three unpolarised components were observed. The 
separation of the lines was symmetrical about the central line and, to a first approximation, 
proportional to the electric field strength. In subsequent experiments, Stark showed that 
there was no further splitting of the H$\alpha$ line, but H$\beta$ was split into 13 components, H$\gamma$ 
into 15 and H$\delta$ into 17 components in both the transverse and longitudinal directions (Stark, 
1914). In the case of helium, six components were observed: three with polarisation parallel 
and three with polarisation perpendicular to the field direction. More complex splittings of 
the emission lines of helium and heavier elements were observed as greater electric field 
strengths became available (Fig. 7.3b). 

Fig. 7.3 (a) A schematic diagram showing the experimental arrangement for the production of canal rays. Positively charged 
particles are accelerated from the anode to the cathode and pass through the holes in the cathode. The result is a set of 
‘canal rays’ as illustrated in the diagram which strike the walls of the discharge tube. (b) The Stark effect showing the 
splitting of the 438.8 nm line of helium. The strength of the electric field increases down the diagram. In the left-hand 
panel, the light is observed parallel to the electric field while in the right-hand panel, the light is polarised 
perpendicular to the electric field (Foster, 1930). 

In the same year, Warburg and Bohr realised that Bohr’s model of the hydrogen atom 
offered an explanation of the Stark effect (Warburg, 1913; Bohr, 1914). Suppose the electric 
field acts in the z-direction. Then, in addition to the electrostatic force of the nucleus, 
$f = Ze^2/4\pi\epsilon_0 r^2$, there is a perturbing electrostatic force on the electron due to the field 
which does work $-e\mathcal{E}z$, where z can be taken to be the distance from the nucleus in the 
z-direction. The nucleus is assumed to be infinitely heavy and so remains stationary. The 
change of energy of the stationary state is found by averaging z over the orbit, so that 
$\Delta E \sim e\mathcal{E}\bar{z}$. The effect of the perturbing field clearly depends upon the orientation and 
ellipticity of the orbit of the electron, but it is apparent that the result will be to remove the 
degeneracy of the energy levels of hydrogen-like atoms because of the differing ellipticities 
of their orbits for a given principal quantum number (Fig. 5.2); we recall that elliptical 
orbits, with the same semi-major axis and with the nucleus in one focus, all have the same 
total energy. Inspection of the orbits in Fig. 5.2 suggests that the actual splitting depends 
upon the quantum numbers $n$ and $n'$ and that $\bar{z}$ should be of the order $a = \epsilon_0 h^2 n^2/\pi m_e e^2 Z$, 
the radius of a circular Bohr orbit with principal quantum number n. Hence, the change in 
energy $\Delta E_n$ of the nth energy level is expected to be of the order 

$$\Delta E_n \sim e\mathcal{E}a = \frac{\epsilon_0 h^2 n^2 \mathcal{E}}{\pi m_e e Z}. \tag{7.1}$$

Fig. 7.4 (a) The system of confocal parabolae parameterised by coordinates $\xi$ and $\eta$ which form a set of parabolic coordinates 
used in the analysis of the Stark effect. (b) Illustrating the dynamics of the electron within the bounding quadrangle 
formed by the parabolae ($\xi_{\max}$, $\xi_{\min}$) and ($\eta_{\max}$, $\eta_{\min}$) (Sommerfeld, 1919). 


Bohr’s estimates of the magnitude of AE, provided a good account of the Stark effect in 
hydrogen, but he understood that a complete solution would require the determination of 
the perturbed orbit of the electron in the atom in the presence of a uniform electric field. 
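
To give a feeling for the size of the effect, the order-of-magnitude estimate, in which the energy change is the electron charge times the field times a length of the order of the Bohr orbit radius, can be evaluated numerically. This is a rough sketch only: the field strength of $10^7$ V m$^{-1}$ is an assumed illustrative value, the constants are the SI values, and the function names are hypothetical.

```python
import math

H_PLANCK = 6.62607015e-34      # Planck's constant, J s
EPS0 = 8.8541878128e-12        # permittivity of free space, F/m
M_E = 9.1093837015e-31         # electron mass, kg
E_CHARGE = 1.602176634e-19     # elementary charge, C


def bohr_orbit_radius(n: int, z: int = 1) -> float:
    """Radius in metres of the circular Bohr orbit with principal quantum number n."""
    return EPS0 * H_PLANCK ** 2 * n ** 2 / (math.pi * M_E * E_CHARGE ** 2 * z)


def stark_shift_estimate_ev(n: int, field: float, z: int = 1) -> float:
    """Order-of-magnitude energy change e * field * a, expressed in eV."""
    energy_joules = E_CHARGE * field * bohr_orbit_radius(n, z)
    return energy_joules / E_CHARGE


# Assumed field of 1e7 V/m acting on the n = 2 level of hydrogen.
print(stark_shift_estimate_ev(2, 1.0e7))
```

For these assumed values the estimate is of the order of a few thousandths of an electron-volt, small compared with the spacing of the Balmer levels but within reach of optical spectroscopy.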

The solution was provided almost simultaneously by Paul Epstein (1916a) and Karl 
Schwarzschild (1916). They used the action-angle variables introduced in Sects. 5.4.4 
and 5.5 and followed the procedures developed by Delaunay (1860) in his studies of 
the perturbing effects of distant planets upon the dynamics of the Earth-Moon system. 
Sommerfeld (1919) provides an elegant description of these calculations. 

These authors began with the more general problem of the motion of an electron, or 
satellite, in the field of two centres of attraction. The astrophysicists had shown that the 
problem can be reduced to Hamilton—Jacobi form in which the coordinate system consists 
of a family of confocal ellipses and hyperbolae with the two planets as centres of attraction 
in each focus. If one of the centres of attraction is then taken to infinity while the force of 
attraction at the other is kept constant, the result is the same as a uniform field at the other 
centre of attraction. In this limiting case, the system of coordinates becomes a set of confocal 
parabolae parameterised by coordinates $\xi$ and $\eta$ as illustrated in Fig. 7.4a. To complete the 
three-dimensional coordinate system, the angle $\psi$ is taken to be the polar angle with respect 
to an axis through the focus perpendicular to the $\xi$-$\eta$ plane. The transformations from (x, y) 
to $(\xi, \eta)$ coordinates are: 

$$x = \frac{\xi^2 - \eta^2}{2}, \qquad y = \xi\eta. \tag{7.2}$$
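We now follow the procedures for finding the orbit of the electron in $(\xi, \eta, \psi)$ coordinates 
according to the Hamilton-Jacobi prescription of Sect. 5.5. The total energy $E_{\rm tot}$ of the 
electron, or its Hamiltonian H, is found to be 

$$E_{\rm tot} = H = \frac{1}{2m_e(\xi^2+\eta^2)}\left[p_\xi^2 + p_\eta^2 + \left(\frac{1}{\xi^2}+\frac{1}{\eta^2}\right)p_\psi^2 + m_e e\mathcal{E}\left(\xi^4-\eta^4\right) - \frac{m_e Z e^2}{\pi\epsilon_0}\right]. \tag{7.3}$$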
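The equations for the velocities and momenta in $(\xi, \eta, \psi)$ coordinates are given by Hamilton’s 
equations (5.69), which become 

$$\dot{\xi} = \frac{\partial H}{\partial p_\xi} = \frac{p_\xi}{m_e(\xi^2+\eta^2)}, \quad \dot{\eta} = \frac{\partial H}{\partial p_\eta} = \frac{p_\eta}{m_e(\xi^2+\eta^2)}, \quad \dot{\psi} = \frac{\partial H}{\partial p_\psi} = \frac{p_\psi}{m_e\xi^2\eta^2}, \tag{7.4}$$

$$\dot{p}_\xi = -\frac{\partial H}{\partial \xi}, \quad \dot{p}_\eta = -\frac{\partial H}{\partial \eta}, \quad \dot{p}_\psi = -\frac{\partial H}{\partial \psi} = 0. \tag{7.5}$$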


The importance of working in parabolic coordinates is that the Hamiltonian (7.3) is separable 
and furthermore, from the last equation of (7.5), the $p_\psi$ momentum is a constant, 
corresponding to the conservation of angular momentum. The details of the calculations to 
find the expressions for $p_\xi$ and $p_\eta$ are given by Sommerfeld (1919). The result is a system 
of conditionally periodic orbits similar to those illustrated in Fig. 5.4, but now in parabolic 
coordinates (Fig. 7.4b). Again the orbits just touch the bounding quadrangle formed by the 
parabolae $\xi_{\max}$, $\xi_{\min}$ and $\eta_{\max}$, $\eta_{\min}$. The orbits would eventually pass through every 
point within the bounding quadrangle. 

The next step is to apply the Bohr-Sommerfeld quantisation conditions (5.117)-(5.119) in 
the $(\xi, \eta, \psi)$ coordinate system, 

$$\oint p_\xi\,d\xi = n_1 h, \qquad \oint p_\eta\,d\eta = n_2 h, \qquad \int_0^{2\pi} p_\psi\,d\psi = n_3 h, \tag{7.6}$$

where $n_1$, $n_2$ and $n_3$ are positive integers. $n_1$ and $n_2$ are referred to as parabolic quantum 
numbers while $n_3$ is known as the equatorial quantum number. Carrying out the calculations 
for the energy levels of hydrogen-like atoms to first order in the electric field $\mathcal{E}$, Epstein 
and Schwarzschild found the result 

$$E = -\frac{m_e Z^2 e^4}{8\epsilon_0^2 h^2}\,\frac{1}{(n_1+n_2+n_3)^2} - \frac{3h^2\epsilon_0\mathcal{E}}{2\pi m_e Z|e|}\,(n_2-n_1)(n_1+n_2+n_3). \tag{7.7}$$





We recognise immediately that the first term on the right-hand side of (7.7) is the same as 
the expression (5.120) for the energy levels of the Bohr-Sommerfeld atom. In addition, the 
quantisation conditions applied to these orbits in the presence of the electric field result in 
a splitting of each energy level, as given by the second term on the right-hand side of (7.7). 
The term is of the same form as that derived from our order of magnitude calculation (7.1), 
but now the dependence of the splitting of the energy levels on the quantum numbers $n_1$, $n_2$ 
and $n_3$ has been determined. The frequency splittings associated with transitions between 
an initial stationary state $(m_1, m_2, m_3)$ and a final state $(n_1, n_2, n_3)$ are 

$$\Delta\nu = \frac{3h\epsilon_0\mathcal{E}}{2\pi m_e Z|e|}\left[(m_1-m_2)(m_1+m_2+m_3) - (n_1-n_2)(n_1+n_2+n_3)\right]. \tag{7.8}$$

Finally, Sommerfeld’s selection rules for radiative transitions between stationary states 
could be applied to determine the predicted splittings of the lines of the Balmer series in 
hydrogen. Epstein found that these rules provided a complete description of Stark’s data on 
the splittings of the lines of the Balmer series, as well as showing empirically that even-numbered 
differences in $m_3 - n_3$ give rise to polarisation in the direction of the external 
field, while odd-numbered differences give rise to polarisation perpendicular to the electric 
field. These calculations were of the greatest importance for promoting the case for the old 
quantum theory. In particular, as seen from (7.8), the splitting of the Balmer lines in the 
Stark effect depends upon Planck’s constant h, in contrast to Lorentz’ explanation of the 
normal Zeeman effect which does not depend upon h. The argument provides independent 
support for Bohr’s postulate of the central importance of quantisation effects on the atomic 
scale. As Epstein (1916b) remarked 


‘We believe that the reported results prove the correctness of Bohr’s atomic model with 
such striking evidence that even our conservative colleagues cannot deny its cogency. 
It seems that the potentialities of quantum theory as applied to this model are almost 
miraculous and far from being exhausted.’ 


Sommerfeld was equally effusive: 


‘All in all, we may regard the theory of the Stark effect as one of the most striking 
achievements of the quantum theory in atomic physics.’ (Sommerfeld, 1919) 
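
The pattern of level shifts implied by the Epstein-Schwarzschild result can be tabulated with a few lines of code. The sketch below makes two assumptions which should be checked against the original papers: the frequency unit is read as $3h\epsilon_0\mathcal{E}/2\pi m_e Z|e|$, and the parabolic quantum numbers $n_1$, $n_2$ are allowed to be zero while $n_3$ is at least one. The field strength and all function names are illustrative.

```python
import math

H_PLANCK = 6.62607015e-34      # Planck's constant, J s
EPS0 = 8.8541878128e-12        # permittivity of free space, F/m
M_E = 9.1093837015e-31         # electron mass, kg
E_CHARGE = 1.602176634e-19     # elementary charge, C


def stark_frequency_unit(field: float, z: int = 1) -> float:
    """Frequency unit 3 h eps0 field / (2 pi m_e Z |e|) in Hz."""
    return 3.0 * H_PLANCK * EPS0 * field / (2.0 * math.pi * M_E * z * E_CHARGE)


def level_factors(n: int) -> list:
    """Distinct values of (n1 - n2)(n1 + n2 + n3) for n1 + n2 + n3 = n.

    Assumes n1, n2 >= 0 and n3 >= 1; this labelling is an assumption of the sketch.
    """
    factors = set()
    for n3 in range(1, n + 1):
        for n1 in range(0, n - n3 + 1):
            n2 = n - n3 - n1
            factors.add((n1 - n2) * n)
    return sorted(factors)


unit = stark_frequency_unit(1.0e7)  # assumed field of 1e7 V/m
for n in (2, 3):
    print(n, [f * unit for f in level_factors(n)])
```

Under these assumptions the n = 2 level splits into three sublevels and the n = 3 level into five, each pattern being symmetric about the unperturbed energy. The frequency differences between sublevels of the two states, filtered by the selection rules, give the components of the H$\alpha$ line.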


7.3 The Zeeman effect 


In the mid-1890s, Lorentz and Larmor had accounted for Zeeman’s discovery of the broad- 
ening and splitting of spectral lines in the presence of strong magnetic fields by the action of 
the magnetic field upon the oscillating or gyrating ‘ions’ in atoms (Sect. 4.1). The splitting 
of the line consisted of three components, a central line plus two lines, equally spaced on 
either side of it with displacements 

$$\Delta\omega_0 = \pm\frac{eB}{2m_e}, \qquad \Delta\nu_0 = \pm\frac{eB}{4\pi m_e}. \tag{7.9}$$














This splitting was referred to as the normal Zeeman effect. Zeeman was fortunate in that he 
only observed broadening of the D lines of sodium, whereas he detected the characteristic 
triple splitting of the spectral lines of cadmium and zinc, which are singlet lines and display 
the normal Zeeman effect. When Preston carried out his experiments on the Zeeman effect 
in sodium using the powerful electromagnet at the Royal College of Science in Dublin, he 
found something quite different (Preston, 1898). In his words, 

‘It is interesting to notice that the two lines of sodium and the blue line 4800 (Å) of 
cadmium do not belong to the class which show as triplets. In fact, the blue cadmium line 
belongs to the weak-middled quartet class, while one of the D lines (D2) shows as a sextet 
of fine bright lines. ... the other D line (D1) shows as a quartet...’ 


These splittings of the D-lines were confirmed by Cornu (1898), the $D_1$ line being a 
quadruplet and the $D_2$ line a sextuplet (Fig. 7.2b). These phenomena were referred to as the 
anomalous Zeeman effect. 

Following the development of the Bohr-Sommerfeld quantum model of the atom, the 
challenge to theorists was to apply these concepts to the description of the Zeeman effect, 
in the hope that this would cast light upon the anomalous Zeeman effect. This was taken 
up by Debye (1916) and Sommerfeld (1916b) who used the formalism of action—angle 
variables which proved to be ideally matched to the requirements of quantum theory. The 
motion of the electron was determined by the combined influences of the electrostatic field 
of the nucleus and the Lorentz force $\mathbf{f} = e(\mathbf{v} \times \mathbf{B})$ associated with the magnetic field $\mathbf{B}$. 
If the magnetic field is homogeneous and uniform along the z-direction, the equations of 
motion are (see also (4.2)-(4.4)): 

$$m_e\ddot{x} = eB\dot{y} - \frac{\partial V}{\partial x}, \tag{7.10}$$

$$m_e\ddot{y} = -eB\dot{x} - \frac{\partial V}{\partial y}, \tag{7.11}$$

$$m_e\ddot{z} = -\frac{\partial V}{\partial z}, \tag{7.12}$$

where we have written $B = |\mathbf{B}|$ and $V = -Ze^2/4\pi\epsilon_0 r$. To convert these equations into a 
form suitable for the application of the Hamilton-Jacobi equations, we need the Hamiltonian 
for the system. Because of the $\mathbf{v} \times \mathbf{B}$ term, no work is done by the magnetic field on the 
electron and so the total energy of the system is 

$$E = \frac{1}{2}m_e\left(\dot{x}^2+\dot{y}^2+\dot{z}^2\right) + V = \frac{1}{2m_e}\left(p_x^2+p_y^2+p_z^2\right) + V. \tag{7.13}$$


To convert this into a Hamiltonian, we need to replace the electron’s three-momentum by 
its conjugate momentum, which is $p_i = m_e\dot{x}_i + eA_i$, where $A_i$ is one of the three components 
of the vector potential $\mathbf{A}$, defined by $\mathbf{B} = \operatorname{curl}\mathbf{A}$ and i = 1, 2, 3 (see Sect. 5.4.3). For a 
uniform magnetic field in the z-direction, 

$$\mathbf{A} = \frac{B}{2}\left(y\,\mathbf{i}_x - x\,\mathbf{i}_y\right), \tag{7.14}$$

and so 

$$m_e\dot{x} = p_1 - \frac{eyB}{2}, \qquad m_e\dot{y} = p_2 + \frac{exB}{2}, \qquad m_e\dot{z} = p_3. \tag{7.15}$$

Therefore, the Hamiltonian for the electron in a uniform magnetic field is 

$$H = E = \frac{1}{2m_e}\left[\left(p_1 - \frac{eyB}{2}\right)^2 + \left(p_2 + \frac{exB}{2}\right)^2 + p_3^2\right] + V. \tag{7.16}$$

The next step, which was carried out by Debye, is to transform the Hamiltonian from 
Cartesian (x, y, z) to spherical polar $(r, \theta, \phi)$ coordinates. Carrying out this point transformation, 

$$H = \frac{1}{2m_e}\left(p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\phi^2}{r^2\sin^2\theta} + eBp_\phi\right) - \frac{Ze^2}{4\pi\epsilon_0 r}. \tag{7.17}$$


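This Hamiltonian is separable and does not depend upon $\phi$. $\phi$ is therefore a cyclic variable for 
which we can write $p_\phi = \text{constant} = \alpha_3$. Following the procedures described in Sect. 5.5, 
the Hamilton-Jacobi equation can be written 

$$\left(\frac{\partial W}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial W}{\partial \theta}\right)^2 + \frac{\alpha_3^2}{r^2\sin^2\theta} - \frac{m_e Z e^2}{2\pi\epsilon_0 r} = 2m_e\left(\alpha_1 - \omega\alpha_3\right), \tag{7.18}$$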








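where $\omega = eB/2m_e$ and $\alpha_1$ is the negative total energy constant. The equation (7.18) is 
separable in the r and $\theta$ coordinates and the constants of the motion are given by 

$$p_\theta = \frac{\partial W}{\partial \theta} = \left(\alpha_2^2 - \frac{\alpha_3^2}{\sin^2\theta}\right)^{1/2}, \tag{7.19}$$

$$p_r = \frac{\partial W}{\partial r} = \left(2m_e\alpha_1 + \frac{m_e Z e^2}{2\pi\epsilon_0 r} - 2m_e\omega\alpha_3 - \frac{\alpha_2^2}{r^2}\right)^{1/2}, \tag{7.20}$$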

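where $\alpha_2$ is the third constant of the motion. Finally, we apply the Bohr-Sommerfeld 
quantum conditions: 

$$\oint p_\phi\,d\phi = 2\pi\alpha_3 = m_3 h, \qquad 2\int_{\theta_1}^{\theta_2} p_\theta\,d\theta = m_2 h, \qquad 2\int_{r_1}^{r_2} p_r\,dr = m_1 h. \tag{7.21}$$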


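Carrying out the second and third integrals of (7.21), we find 

$$2\pi(\alpha_2 - \alpha_3) = m_2 h, \qquad \frac{m_e Z e^2}{2\epsilon_0\sqrt{-2m_e(\alpha_1 - \omega\alpha_3)}} - 2\pi\alpha_2 = m_1 h. \tag{7.22}$$

The expression (7.22) can be inverted to express the quantised energy levels, parameterised 
by $E_n = \alpha_1$, which for hydrogen (Z = 1) are 

$$E_n = -\frac{m_e e^4}{8\epsilon_0^2 n^2 h^2} + \frac{m_3 h\omega}{2\pi}, \tag{7.23}$$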


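where $n = m_1 + m_2 + m_3$. From this expression, we find from Bohr’s frequency condition, 

$$\nu = \frac{m_e e^4}{8\epsilon_0^2 h^3}\left(\frac{1}{n'^2} - \frac{1}{n''^2}\right) + \frac{\omega}{2\pi}\left(m_3'' - m_3'\right), \tag{7.24}$$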
where the single primed quantities characterise the final orbit and the double primed 
quantities the initial orbit. This is exactly the same as Bohr’s original formula for the Balmer 
series of the hydrogen atom (4.28), but now it involves elliptical orbits parameterised by 
$m_1$, $m_2$ and $m_3$ and includes the effects of the magnetic field on the energy of the orbits. 
We recognise that $m_3$ is the azimuthal quantum number $n_\phi$ introduced in Sect. 6.4 and, 
following the considerations of Sect. 6.5, the selection rule for $m_3$ is the same as (6.55), 
namely 

$$\Delta m_3 = +1, \quad \Delta m_3 = -1, \quad \Delta m_3 = 0. \tag{7.25}$$


These splittings correspond exactly to the normal Zeeman effect and the polarisation
properties are identical to those described in Sect. 6.5, namely the central line Δm_3 = 0 is
linearly polarised and the Δm_3 = ±1 components are circularly polarised. It can also be
observed that the Zeeman splitting term in (7.24) does not contain Planck’s constant and so
is an entirely classical term. Sommerfeld was well aware of the fact that, despite the more
sophisticated tools used, the analysis provided no more than the classical results derived by
Lorentz and Larmor. As expressed by Sommerfeld


‘In the present state the quantum treatment of the Zeeman effect achieves just as much as 
Lorentz’s theory, but not more. It can account for the normal triplet . . . but hitherto it has 
not been able to explain the complicated Zeeman types.’ 


Sommerfeld went on to consider the relativistic case, but this did not provide any further 
insights (Sommerfeld, 1916b). 
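The classical character of the splitting term can be made concrete with a short numerical sketch. It evaluates the frequency shift Δm_3 ω/2π with ω = eB/2m_e for the three components allowed by (7.25). The function names and the 1 T flux density are illustrative assumptions, not taken from the text; note that Planck's constant does not appear anywhere in the calculation.

```python
import math

# SI constants (CODATA 2018 values).
E_CHARGE = 1.602176634e-19     # elementary charge, C
M_ELECTRON = 9.1093837015e-31  # electron mass, kg


def larmor_omega(b_tesla: float) -> float:
    """Angular frequency omega = e B / (2 m_e) appearing in (7.23)."""
    return E_CHARGE * b_tesla / (2.0 * M_ELECTRON)


def zeeman_shift_hz(delta_m3: int, b_tesla: float) -> float:
    """Frequency shift delta_m3 * omega / (2 pi) of one Zeeman component."""
    if delta_m3 not in (-1, 0, 1):
        raise ValueError("selection rule (7.25) allows only -1, 0, +1")
    return delta_m3 * larmor_omega(b_tesla) / (2.0 * math.pi)


if __name__ == "__main__":
    b = 1.0  # illustrative flux density in tesla (an assumption)
    for dm in (-1, 0, 1):
        print(f"delta m_3 = {dm:+d}: shift = {zeeman_shift_hz(dm, b):+.4e} Hz")
```

The three shifts form the symmetric normal triplet, with a spacing of roughly 14 GHz per tesla.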

Sommerfeld labelled m_3 the magnetic quantum number because of its appearance in
(7.23) in the guise of ω = eB/2m_e, where m_e is the mass of the electron; m_3 reappears in
the full theory of quantum mechanics as the quantum number m. The quantum number m_3
corresponds to the projection of the total angular momentum vector, characterised by the
quantum number k, onto the magnetic field direction and hence the energy term associated
with the magnetic field can also be written

\[ \frac{m_3 h\omega}{2\pi} = \frac{kh}{2\pi}\,\frac{eB}{2m_e}\cos\alpha , \tag{7.26} \]

where |L| = kh/2π is the magnitude of the total angular momentum and α is the angle
between the magnetic field direction and the total angular momentum vector L. Introducing
the magnetic moment of the electron μ = (e/2m_e)L in its orbit about the nucleus, this
energy can be written μ · B, the interaction energy between a magnetic dipole and a
magnetic field. If we consider the magnetic moment of an electron in the lowest energy
Bohr orbit, k = 1, we find that |μ| = (e/2m_e)(h/2π) = eh/4πm_e. Pauli introduced the
term Bohr magneton to refer to this natural unit of magnetic moment which is written
μ_B = eh/4πm_e = eħ/2m_e.
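As a numerical check on the size of this unit, the following sketch evaluates |μ| = (e/2m_e)(kh/2π) for k = 1 and the energy term (7.26) for a chosen angle α. The function names are hypothetical and the constants are standard SI values rather than numbers quoted in the text.

```python
import math

# SI constants (CODATA 2018 values).
E_CHARGE = 1.602176634e-19     # elementary charge, C
M_ELECTRON = 9.1093837015e-31  # electron mass, kg
H_PLANCK = 6.62607015e-34      # Planck constant, J s


def orbital_moment(k: int) -> float:
    """|mu| = (e / 2 m_e) * (k h / 2 pi), in J/T, for angular momentum k h / 2 pi."""
    return (E_CHARGE / (2.0 * M_ELECTRON)) * (k * H_PLANCK / (2.0 * math.pi))


def field_energy(k: int, alpha_rad: float, b_tesla: float) -> float:
    """Energy term of (7.26): (k h / 2 pi)(e B / 2 m_e) cos(alpha), in joules."""
    return orbital_moment(k) * b_tesla * math.cos(alpha_rad)


# The Bohr magneton is the k = 1 case.
BOHR_MAGNETON = orbital_moment(1)

if __name__ == "__main__":
    print(f"Bohr magneton = {BOHR_MAGNETON:.4e} J/T")
    print(f"energy at alpha = 0, B = 1 T: {field_energy(1, 0.0, 1.0):.4e} J")
```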





7.4 The anomalous Zeeman effect 


While the Bohr-Sommerfeld model of the atom could account for the normal Zeeman 
effect, it could not account for the much more complex splittings observed in the anomalous 
Zeeman effect. The origin of the multiplicities of spectral lines was clearly related to the 
anomalous Zeeman effect since singlet lines displayed the normal Zeeman effect while the 
higher multiplets all exhibited anomalous Zeeman effects. Although the Bohr-Sommerfeld 
model could account for the properties of hydrogen-like atoms, multiple electron systems 
proved to be intractable; models for helium with two electrons and a nucleus with two
opposite electric charges could not be reconciled with the experimental results. Sommerfeld 
realised that, with the analyses of Zeeman splitting discussed in Sect. 7.3, he had gone as far 
as he could with the standard procedures of action—angle variables and a different approach 
would be required. His efforts were devoted to finding empirical relations which would give 
insight into the origin of the anomalous Zeeman effect. 

The Rydberg series of elements such as sodium and calcium could be accounted for in 
terms of a modified variant of the Bohr-Sommerfeld model in which the shielding of the 
nuclear charge by the orbiting electrons was included in the expression for the electrostatic 
potential, as was demonstrated in Sect. 6.6. This modified model could also account for 
the multiplicities of spectral lines — just as in the Stark effect the degeneracy associated 
with the Balmer series was lifted by an applied electric field, so the internal electric field 
associated with the multiple electron systems could lead to the lifting of the degeneracy 
of the energy levels and change their energies. We recall that the degeneracy of the states 
of hydrogen-like atoms is associated with the assumption of the pure inverse-square law 
electrostatic potential of the nucleus — the presence of additional electrons results in a 
non-inverse-square law of attraction close to the nucleus. 

These considerations led to a natural ordering of the various series in terms of the
magnitudes of the quantum defects described by the quantities s, p and d in (1.19), (1.20) and
(1.21). Sommerfeld had shown that the properties of the ellipses could be characterised
by the two quantum numbers n, which determined the energy of the stationary state, and k,
which determined the total angular momentum of the orbit (Sect. 5.5). The emission lines
were assumed to be associated with the ‘valence electrons’, the outermost electrons of the
atoms. Thus, the Rydberg terms with the smallest quantum defects could be associated with
valence electrons which had circular orbits and so experienced the fully shielded nuclear
charge whereas the ‘penetrating’ orbits with large eccentricities would be expected to have
the greatest quantum defects. The quantum defects of the different series decrease in the
order s, p, d, f, ... Therefore, it was argued that the series with the greatest quantum defects
should have the most ‘penetrating’ orbits, corresponding to the maximum eccentricity for a
given principal quantum number, that is, k = 1. On the other hand, those with the smallest
quantum defects should be associated with circular orbits for which n = k. This led to
the association of the quantum numbers k = 1, 2, 3, 4, ... with the terms s, p, d, f, ...,
corresponding to the designations associated with the sharp, principal, diffuse and
fundamental series.

The issue arose whether or not the quantum numbers n and k, together with the selection
rules for k, were sufficient to account for all the features observed in atomic spectra.
The introduction of the magnetic quantum number showed that this feature had to be
accommodated in the rules. Sommerfeld was also concerned by the fact that some transitions
which would have been permitted according to Ritz’s combination principle were not in fact
observed (Sommerfeld, 1920a). This suggested that some other selection principle was at
work, associated with an additional inner quantum number, which at Bohr’s suggestion, he
named j. Thus, each state would be characterised by three quantum numbers n, k and j. For
example, for a state with n = 5, k = 1 and j = 3, Sommerfeld introduced the notation 5s_3,
since k = 1 characterised the s states. From his analyses of the doublets and triplets of the
diffuse series, Sommerfeld was able to assign numerical values of j such that the selection
rule for j became Δj = ±1 or 0. Alfred Landé took the analysis further and showed that
the transition j = 0 to j = 0 also had to be excluded. Sommerfeld was well aware of the
fact that these were empirical rules without any underpinning theoretical justification.

To give more physical significance to the introduction of this additional quantum number,
Sommerfeld and Landé developed their magnetic core hypothesis according to which the
atomic core, consisting of the nucleus and the inner, non-valence, electrons, was attributed
an angular momentum sh/2π, where s took values 1, 2, ... Associated with s was the
corresponding magnetic moment μ = (e/2m_e)L which was to be aligned with the angular
momentum vector of the core (Sommerfeld, 1923, 1924; Landé, 1921a,b, 1923a,c). This
may be thought of as an ‘inner Zeeman effect’, the angular momentum being subject to the
rules of quantisation and so the vector could only take specific quantised orientations with
respect to any given axis. Such a hypothesis had already been introduced by
Roschdestwensky in connection with the doublet separation of the first line of the principal
series of lithium (Roschdestwensky, 1920).

Let us return to some of the early results which stimulated Landé’s interest in the 
anomalous Zeeman effect. Gradually the patterns of Zeeman splittings began to show 
significant regularities. In 1898, Preston formulated his law that 





‘All the lines of a given series of a substance exhibit the same pattern of components in 
a magnetic field; moreover, analogous spectral lines of the same series, even if belonging 
to different elements have the same Zeeman effect.’ (Preston, 1898) 


By ‘series’ is meant columns in the periodic table containing similar elements, such as
the sequence of the alkali metals sodium, potassium, rubidium (Group 1) or the alkaline
earth metals magnesium, calcium, strontium, barium (Group 2). This was followed by the
rule discovered by Runge and Paschen (1900) to the effect that,


‘the frequency differences for lines of the same type (multiplicity) turned out to be the 
same.’ 


Of particular interest for interpreting the nature of the anomalous Zeeman effect was
Runge’s analysis of 1907 in which he provided a remarkably simple expression for the
frequency, or wavelength, dependence of the splittings of the spectral lines (Runge, 1908).
The normal Zeeman effect could be written

\[ \Delta\nu_{\rm norm} = a = \frac{eB}{4\pi m_e} . \tag{7.27} \]


In Runge’s words, 


‘The hitherto observed complex splittings of spectral lines in a magnetic field exhibit
the following peculiarity: the distances of the components from the centre are integral
multiples of just a fraction of the normal separation a ... So far, the fractions a/2, a/3,
a/4, a/6, a/7, a/11 and a/12 have been definitely observed. That is, ... δν the frequency
difference(s) were integral multiples of the quantity a/r, where ... the denominator r is
an integer between 1 (for the normal Zeeman effect) and 12.’


The quantity r became known as the Runge denominator. 
Exceptions to the rules concerning the appearance of the anomalous Zeeman effect
were discovered by Voigt and Hansen in helium and lithium (Voigt and Hansen, 1912). 
Particularly in the case of lithium, it had been expected that the lines would display the 
same anomalous Zeeman effect as observed in sodium and potassium, but instead the normal 
Zeeman effect was observed. As stronger magnetic fields became available, Paschen and 
Back found that at large enough magnetic flux densities, the anomalous Zeeman effect was 
replaced by the normal effect (Paschen and Back, 1912). Thus, in the case of the oxygen 
triplet, for example, below 0.6 T, the three lines separated into the standard anomalous 
Zeeman effect. At 0.6 T, however, the effect of the strong magnetic field was that the 
different natural line components overlapped and, at greater magnetic flux densities, a 
transformation began to take place until at magnetic flux densities of 4 T, the pattern of 
line splittings reverted to the normal Zeeman effect with the standard three lines. This 
Paschen—Back effect could account for all the exceptions to Preston’s law. 

Sommerfeld appreciated that the interpretation of the anomalous Zeeman effect had 
the potential to unlock further insights into quantum physics. His concept was to give 
further physical meaning to the empirical rules associated with the anomalous Zeeman 
effect. In 1919, he introduced his decomposition rule starting from an analysis of the Runge 
denominators (Sommerfeld, 1920b). Sommerfeld began with Runge’s rule written in the
form Δν_an = qa/r, where q, the integral numerator, took values 0, ±1, ±2, ... and the
characteristic values of r were 1, 2 and 3 for singlet, doublet and triplet lines respectively.
According to the precepts of quantum theory, the lines had to originate between stationary
states of the atoms in the magnetic field which had energy displacements with respect to
the unperturbed states ΔW_1 and ΔW_2 so that

\[ \Delta\nu_{\rm an} = \frac{\Delta W_1 - \Delta W_2}{h} = \Delta\nu_1 - \Delta\nu_2 . \tag{7.28} \]

To obtain Runge’s rule, the frequency shifts Δν_i also had to satisfy the rule and so

\[ \Delta\nu_i = \frac{q_i a}{r_i} , \quad \text{where } i = 1, 2 , \tag{7.29} \]

where q_i and r_i are the Runge numerators and denominators associated with the individual
energy states. Hence, from (7.28) Runge’s rule could be written in the following form


\[ \Delta\nu_{\rm an} = \frac{qa}{r} = \left(\frac{q_1}{r_1} - \frac{q_2}{r_2}\right)a = \frac{q_1 r_2 - q_2 r_1}{r_1 r_2}\,a . \tag{7.30} \]


Now, Sommerfeld associated the Runge denominators with the properties of the energy 
levels themselves and not just the transitions. In his words, 


‘We denote the equation, r = r_1 r_2, as the magneto-optical decomposition rule and the
Runge number observed in the anomalous Zeeman effect can be decomposed into Runge
numbers of the first and second terms involved in the spectral lines.’ (Sommerfeld, 1920b)
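The arithmetic behind the decomposition rule can be sketched with exact rational numbers. The code below forms the level shifts q_i/r_i in units of the normal separation a, subtracts them as in (7.28), and confirms that the result can always be written over the denominator r_1 r_2 as in (7.30). The particular values of q_i and r_i are illustrative assumptions, not measured Runge numbers.

```python
from fractions import Fraction


def level_shift(q: int, r: int) -> Fraction:
    """Shift of one energy level in units of the normal separation a, as in (7.29)."""
    return Fraction(q, r)


def line_shift(q1: int, r1: int, q2: int, r2: int) -> Fraction:
    """Shift of the spectral line: difference of the two level shifts, as in (7.28)."""
    return level_shift(q1, r1) - level_shift(q2, r2)


def combined_form(q1: int, r1: int, q2: int, r2: int) -> Fraction:
    """The same shift written over the common denominator r1 * r2, as in (7.30)."""
    return Fraction(q1 * r2 - q2 * r1, r1 * r2)


if __name__ == "__main__":
    # Illustrative numerators and denominators only.
    q1, r1, q2, r2 = 1, 2, 1, 3
    shift = line_shift(q1, r1, q2, r2)
    print(f"line shift = {shift} a, with r1 * r2 = {r1 * r2}")
```

Because `Fraction` reduces to lowest terms, the printed denominator always divides r_1 r_2 but can be smaller when the numerator and denominator share a factor.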


This was the beginning of a complex story of the development of empirical rules to
describe the many features of the anomalous Zeeman effect. The principal contributors
to these developments were Landé, Sommerfeld and the young Werner Heisenberg. Landé
rapidly became the authority on the anomalous Zeeman effect, particularly after he joined
Paschen’s group at Tübingen in 1922 as an extraordinary (associate) professor of theoretical
physics. At that time, Tübingen was the leading centre for experimental studies of the
anomalous Zeeman effect and Paschen needed Landé’s theoretical expertise to make sense
of the experimental results.

Fig. 7.5 Illustrating Landé’s construction for the vector combination of angular momenta R and K to form the total angular
momentum vector J. The separate angular momentum vectors R and K precess about the resultant vector J.

In 1919, Landé introduced the vector model of the atom to represent the addition of
angular momenta by analogy with their vectorial representation in classical mechanics
(Landé, 1919). He represented the angular momentum of the core by the vector R which
was to be combined vectorially with the angular momentum K of the electron responsible
for the optical emission to form a resultant vector J which represented the total angular
momentum of the atom (Fig. 7.5). As in classical mechanics, the separate R and K vectors
were assumed to precess about the resultant angular momentum vector J. In the presence
of a magnetic field, J itself would precess about the magnetic field direction. R and K had
to be combined according to the rules of quantisation and so would combine at different
angles, resulting in a variety of different stationary states with different total energies. As a
result of the magnetic interaction between the magnetic moments of the core and the outer
electron, the energy levels were different and Landé attributed the different energies of the
multiplets to this interaction.

Both Sommerfeld and Landé developed sets of empirical rules to describe the anomalous
Zeeman effect. The vector model of the atom with integral quantum numbers ran into the
difficulty that the singlet states would be associated with a core quantum number s = 1
which implied that the orbital quantum number l = 0, which was contrary to the assignment
of k = 1 according to the standard Bohr-Sommerfeld model. A different assignment of
quantum numbers was proposed by Landé (1921a; 1922) which was widely adopted. In
it, half-integral quantum numbers were introduced with the following assignments:
R = 1/2, 2/2, 3/2, ... for singlet, doublet, triplet, ... systems and K = 1/2, 3/2, 5/2, ... for
the s, p, d, ... states with J half-integral for odd multiplicities and integral for even
multiplicities as the quantum number associated with the sum of the R and K vectors. In
this scheme, R = s + 1/2 and K = l + 1/2. The objective of these schemes was to account
simultaneously for the numbers of splittings of the lines, their polarisation properties and
their separation from the lines in the absence of a magnetic field.

To account for the spacing of the lines in the anomalous Zeeman effect, Landé worked
by analogy with the normal Zeeman effect in which the energy levels were given by (7.23),
ΔE = m_3 h ν_0, where ν_0 is the gyrofrequency. m_3 is the magnetic quantum number which
we showed was the same as the azimuthal quantum number in the Bohr-Sommerfeld model
of the atom. Landé then wrote the splitting of the lines in terms of the quantum number m
with a splitting factor g which determined the actual splitting in terms of combinations of
the vectors R, K and J so that ΔE = g m h ν_0. Empirically, he found the formula

\[ g = 1 + \frac{J^2 - \tfrac{1}{4} + R^2 - K^2}{2\left(J^2 - \tfrac{1}{4}\right)} = 1 + \frac{\bar{J}^2 + \bar{R}^2 - \bar{K}^2}{2\bar{J}^2} , \tag{7.31} \]

where

\[ \bar{J} = \sqrt{(J + \tfrac{1}{2})(J - \tfrac{1}{2})} , \quad \bar{R} = \sqrt{(R + \tfrac{1}{2})(R - \tfrac{1}{2})} \quad \text{and} \quad \bar{K} = \sqrt{(K + \tfrac{1}{2})(K - \tfrac{1}{2})} . \tag{7.32} \]


This formula closely resembles the modern expression for the Landé g-factor.
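The resemblance can be checked with exact arithmetic. The sketch below evaluates (7.31) with the barred quantities of (7.32) and compares it with the modern Landé formula g = 1 + [j(j+1) + s(s+1) − l(l+1)]/[2j(j+1)], which is standard textbook material rather than part of Landé's 1920s scheme. It assumes the translations R = s + 1/2 and K = l + 1/2 quoted in the text together with Sommerfeld's J = j + 1/2; the function names and the choice of doublet terms are illustrative.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def bar_squared(x: Fraction) -> Fraction:
    """Square of a barred quantity from (7.32): (x + 1/2)(x - 1/2)."""
    return (x + HALF) * (x - HALF)


def lande_g_empirical(J: Fraction, R: Fraction, K: Fraction) -> Fraction:
    """Splitting factor from (7.31); undefined when the barred J vanishes (J = 1/2)."""
    jb, rb, kb = bar_squared(J), bar_squared(R), bar_squared(K)
    return 1 + (jb + rb - kb) / (2 * jb)


def lande_g_modern(j, s, l) -> Fraction:
    """Modern g-factor (a standard result, not given in the text)."""
    j, s, l = Fraction(j), Fraction(s), Fraction(l)
    return 1 + (j * (j + 1) + s * (s + 1) - l * (l + 1)) / (2 * j * (j + 1))


if __name__ == "__main__":
    # Doublet terms (s = 1/2): each entry is (label, j, s, l) in modern notation.
    terms = [("2S1/2", Fraction(1, 2), HALF, 0),
             ("2P1/2", Fraction(1, 2), HALF, 1),
             ("2P3/2", Fraction(3, 2), HALF, 1)]
    for label, j, s, l in terms:
        J, R, K = j + HALF, s + HALF, Fraction(l) + HALF
        print(label, lande_g_empirical(J, R, K), lande_g_modern(j, s, l))
```

Under these assumed translations the two expressions agree identically, because each barred square reduces to the familiar x(x + 1) form.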

In order to give more physical content to the theory, Landé next reverted to the vector
model of the atom, but now using J̄, R̄ and K̄ in place of J, R and K. From the geometry
of Fig. 7.5, the cosine rule can be used to find the angle between the vectors J̄ and R̄ and
then

\[ g = 1 + \frac{\bar{R}\cos(\bar{J}, \bar{R})}{\bar{J}} . \tag{7.33} \]

If H is the direction of the magnetic field, then the quantisation of the angular momentum
vector associated with J̄ is given by the magnetic quantum number m such that
m = J̄ cos(J̄, H) and so

\[ mg = \bar{J}\cos(\bar{J}, H) + \bar{R}\cos(\bar{J}, \bar{R})\cos(\bar{J}, H) . \tag{7.34} \]

Now the product of the two cosines in the last term of (7.34) is just the average value of
cos(R̄, H) and so the value of mg can be written

\[ mg = \bar{K}\cos(\bar{K}, H) + 2\bar{R}\cos(\bar{R}, H) . \tag{7.35} \]

This is a rather dramatic result since it might have been expected that, if the angular
momentum vectors had been added together, then their projections on the magnetic field
direction would have resulted in

\[ mg = \bar{K}\cos(\bar{K}, H) + \bar{R}\cos(\bar{R}, H) , \tag{7.36} \]


in other words, the second term associated with the magnetic core contributed twice what
might have been expected according to Larmor’s formula for the precession of the vector
about the magnetic field direction. Landé made the suggestion that there might be an
anomalous magnetic moment associated with the core such that the ratio of the angular
momentum to magnetic moment was only half of the Larmor value, that is, an anomalous
gyromagnetic ratio. Remarkably, there was independent experimental evidence for the
anomalous gyromagnetic ratio from the experiments of Einstein and de Haas and confirmed
by the careful experiments by Beck, as discussed in the next section.

Landé’s achievement was quite remarkable, but there remained a lot to be explained,
in particular, what was the significance of the introduction of J̄ = [(J + 1/2)(J − 1/2)]^{1/2}?
Sommerfeld used a slightly different notation for J such that J = j + 1/2 and so the quantity
J̄ is

\[ \bar{J} = \sqrt{j(j + 1)} . \tag{7.37} \]


To the modern reader, this has a familiar ring but in 1924 its significance was obscure. 
Landé’s achievement was to infer these new features of quantum physics from analysis of 
the anomalous Zeeman effect, before the invention of quantum mechanics. 


7.5 The Barnett, Einstein-de Haas and Stern—Gerlach experiments 


7.5.1 The anomalous gyromagnetic ratio 


The value of the gyromagnetic ratio for atoms had been the subject of various experiments
which originated in attempts to understand the phenomenon of the magnetisation
of materials in terms of the magnetic moment per unit volume associated with what were
termed ‘Amperian currents’. To recapitulate the argument, an electron in a circular orbit
corresponds to a circular current I = eν, where ν is the frequency of rotation of the electron
about the nucleus. If the area of the orbit is A = πr², the magnetic moment of the electron
is μ = IA = eνA. The angular momentum of the electron is L = m_e v r, and so, since
ν = v/2πr, the gyromagnetic ratio is defined to be

\[ \frac{L}{\mu} = \frac{2m_e}{e} . \tag{7.38} \]

The same result is obtained for the bulk properties of a material if the orbits are aligned.
Therefore, there should be a relation between the magnetisation of a material and its angular
momentum. In macroscopic terms,

\[ \frac{|\mathbf{L}|}{|\boldsymbol{\mu}|} = \frac{2m_e}{e} = 1.14 \times 10^{-11} \ \text{in SI units} . \tag{7.39} \]
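The argument leading to (7.38) can be followed step by step in a few lines. The sketch below builds the ratio from the current-loop picture (I = eν, A = πr², L = m_e v r) and compares it with 2m_e/e. The orbit radius and speed in the example are arbitrary illustrative values, since the ratio does not depend on them, and the function names are hypothetical.

```python
import math

# SI constants (CODATA 2018 values).
E_CHARGE = 1.602176634e-19     # elementary charge, C
M_ELECTRON = 9.1093837015e-31  # electron mass, kg


def classical_ratio() -> float:
    """Gyromagnetic ratio L / mu = 2 m_e / e of (7.38), in kg/C."""
    return 2.0 * M_ELECTRON / E_CHARGE


def ratio_from_orbit(radius_m: float, speed_m_s: float) -> float:
    """Build L / mu from the Amperian current-loop argument."""
    nu = speed_m_s / (2.0 * math.pi * radius_m)        # orbital frequency
    area = math.pi * radius_m ** 2                     # area of the orbit
    mu = E_CHARGE * nu * area                          # mu = I A with I = e nu
    angular_momentum = M_ELECTRON * speed_m_s * radius_m
    return angular_momentum / mu


if __name__ == "__main__":
    print(f"2 m_e / e          = {classical_ratio():.4e} kg/C")
    print(f"from a sample orbit = {ratio_from_orbit(5.29e-11, 2.19e6):.4e} kg/C")
    print(f"half this value     = {0.5 * classical_ratio():.4e} kg/C")
```

The last line prints half the classical value, which is the size of the ratio that the later experiments described in this section converged upon.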


Samuel Jackson Barnett realised that it would be possible to measure such an effect using
a cylinder of iron which initially is at rest with zero magnetisation. Then, if the cylinder
acquires angular momentum about its long axis, it should become magnetised. After a long
series of careful experiments carried out at the Physical Laboratory at Ohio State University,
he eventually found a positive effect and measured its magnitude, what became known as
the Barnett effect (Barnett, 1915).

A similar experiment was proposed by Einstein and Wander Johannes de Haas who 
looked for an ‘AC’ effect by switching the direction of magnetisation of the cylinder at the 
frequency of the torsion filament which was used to suspend the cylinder (Einstein and de
Haas, 1915). They found the expected value given by (7.39) with an accuracy of about 10%, 
but this was about a factor of 2 greater than that found by Barnett. Over the next few years, 
these tricky experiments were repeated by Emil Beck, Gustav Arvidsson and the Barnetts 
with the result that the gyromagnetic ratio converged on a value half of that given by (7.39) 
(Beck, 1919a,b; Arvidsson, 1920; Barnett and Barnett, 1922). This phenomenon became 
known as the anomalous gyromagnetic ratio. 


7.5.2 Space quantisation and the Stern—Gerlach experiment 


The possibility of demonstrating experimentally spatial quantisation as predicted by
Sommerfeld’s analysis of the Stark effect was taken up by Otto Stern at the University of
Frankfurt, despite the scepticism of the theorists who viewed quantisation as simply a
formal set of procedures. In 1920, Stern used the method of atomic beams to measure the
mean velocity of neutral silver atoms and found indeed that the experimental value agreed
with the expectations of Maxwell’s kinetic theory (Stern, 1920). He also planned to
measure the velocity dispersion of the atoms, but this would have required many refinements
of the experimental set-up. Instead, he turned to the issue of measuring space quantisation
experimentally. He was joined at Frankfurt by Walther Gerlach, a gifted experimenter, to
attempt the much more difficult task of demonstrating the splitting of spectral lines in a
strong magnetic field gradient. It is a straightforward calculation to show that, although
there is no net force on a magnetic dipole in a uniform magnetic field, there is a net force
if the magnetic field is inhomogeneous. The net force on the dipole is

\[ \mathbf{F} = (\mathbf{m} \cdot \nabla)\mathbf{B} = |\mathbf{m}|\cos\theta\,\frac{\partial B_z}{\partial z} , \tag{7.40} \]


where in the last expression it is assumed that the magnetic field gradient is in the z-direction
and θ is the angle between the axis of the dipole and the magnetic field direction, in this
case the z-direction. Classically, the magnetic moment could take any angle with respect
to the magnetic field direction and so it would be expected that there would be a random
distribution of deflection angles. If, however, space quantisation is a real physical
phenomenon, the deflections would only take place at specific deflection angles θ. According
to the Bohr-Sommerfeld model, an electron in orbit within an atom would have azimuthal
angular momentum p_φ = h/2π and would assume only three orientations with respect to
the magnetic field direction given by

\[ \cos\theta = \frac{n_1}{n_\varphi} . \tag{7.41} \]

Bohr had argued in 1918 that the case n_1 = 0 would be unstable because then the plane of
the orbit of the electron would lie in the direction of the magnetic field and this he argued was
an unstable configuration (Bohr, 1918b). Therefore, it was expected that the beam would
only be split into two components corresponding to cos θ = ±1. Equally, the magnetic-core
theory of fine-structure splitting by Sommerfeld and Landé predicted a splitting of the
s-states of atoms into only two fine-structure lines with no central component, corresponding
to s = ±1/2.
Fig. 7.6 (a) A cross-section through the magnet used by Stern and Gerlach to create an inhomogeneous magnetic field
distribution in the vertical z-direction (Gerlach and Stern, 1922a). (b) Illustrating the deflection of beams of silver
atoms through the inhomogeneous magnetic field of the Stern–Gerlach experiment. Labels in the diagram: furnace,
silver atoms, inhomogeneous magnetic field, classical prediction, what was actually observed.


The experiment proved to be a considerable challenge (Fig. 7.6). The deflection of the
beam was expected to amount to only

\[ s = 1.12 \times 10^{-5} \left(\frac{\partial B_z}{\partial z}\right) \frac{L^2}{T} \ \text{cm} , \tag{7.42} \]

where the field gradient is measured in gauss per centimetre, the temperature in kelvin and
L in cm. For a gradient of 10⁴ G cm⁻¹ and a length through the magnetic field of 3 cm, the
deflection of silver atoms at a temperature of 1000 °C (1273 K) only amounted to about
10⁻² mm. By obtaining the support of various companies for a powerful electromagnet,
for cryogens and vacuum pumps, they worked ceaselessly from the summer of 1921 to
March 1922 on the experiment. Stern moved to the University of Rostock at the beginning
of January 1922 and Gerlach carried on the experiments himself. In their original set-up,
the narrow beam of silver atoms, which was produced by heating a platinum strip coated
with silver to a temperature just above the melting point of silver at 960 °C, was collimated
by small circular holes (Fig. 7.6b), but Gerlach hit upon the idea that it would be better
to replace the second hole with a narrow slit and so increase the intensity of the beam
and allow a ‘differential’ means of observing the splitting. By February 1922, Gerlach
observed the splitting of the beam of silver atoms into two separate beams. Following
further experiments, their paper was published in March 1922 (Gerlach and Stern, 1922a).
In it, they wrote


‘The atomic beam splits up into two distinct beams in a magnetic field. No undeflected
atoms can be detected ... In these results we see direct experimental proof of the
directional [space-] quantisation in a magnetic field.’
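The size of the expected deflection quoted in (7.42) can be reproduced with a rough estimate. The sketch below assumes that an atom carrying one Bohr magneton feels the constant force of (7.40) with cos θ = 1 while crossing the field region, and that its speed satisfies M v² = 3kT. The latter is an assumption made here because it reproduces the numerical coefficient; it is not stated in the text. Units are CGS (gauss, centimetres, ergs) to match the formula, and the function names are hypothetical.

```python
# CGS constants.
MU_B_CGS = 9.2740100783e-21  # Bohr magneton, erg/G
K_B_CGS = 1.380649e-16       # Boltzmann constant, erg/K


def deflection_coefficient() -> float:
    """Coefficient mu_B / (6 k) in s = coeff * (dB/dz) * L^2 / T.

    Follows from s = (1/2)(F/M)(L/v)^2 with F = mu_B dB/dz and the
    assumed thermal speed M v^2 = 3 k T.
    """
    return MU_B_CGS / (6.0 * K_B_CGS)


def deflection_cm(gradient_g_per_cm: float, length_cm: float, temp_k: float) -> float:
    """Deflection in cm for a field gradient in G/cm, path length in cm, temperature in K."""
    return deflection_coefficient() * gradient_g_per_cm * length_cm ** 2 / temp_k


if __name__ == "__main__":
    print(f"coefficient = {deflection_coefficient():.3e}")
    s = deflection_cm(1.0e4, 3.0, 1273.0)  # values quoted in the text
    print(f"deflection  = {s:.2e} cm = {10.0 * s:.2e} mm")
```

With the gradient, path length and temperature quoted in the text, the estimate comes out just below 10⁻² mm, which illustrates why the measurement was so demanding.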


Fig. 7.7 The postcard sent by Stern to Bohr showing the results of the Stern–Gerlach experiment. The left image shows the
rectangular beam in the absence of the magnetic field gradient; the right-hand image shows the splitting of the beam
into two components. The greatest field gradient occurs in the centre of the beam.

The famous postcard showing the result of the experiment from Stern to Bohr is shown in
Fig. 7.7. They also measured precisely the gradient of the magnetic flux density (∂B_z/∂z)
through which the beam of silver atoms passed and, during the Easter vacation, they went
on to show that the magnetic moment of the silver atoms in the ground state was one Bohr
magneton per gram-atom (Gerlach and Stern, 1922b). The importance of the result was not
lost on the physics community. As Paschen wrote to Gerlach,


‘Your experiment proves for the first time the reality of Bohr’s stationary states.’ (Gerlach,
1969)


The Stern–Gerlach experiment is undoubtedly one of the great achievements of
experimental physics, but it raised many issues about the fundamentals of quantum physics.
Stern pointed out that, if space quantisation was a real physical effect, it should give rise
to birefringence, or double refraction, in materials since there would be a difference in the
refractive indices along and perpendicular to the magnetic field direction (Stern, 1921).
Such an effect had never been observed. In addition, there were problems in understanding
how the alignment of the orbits could come about according to classical physics. Einstein
and Ehrenfest (1922) pointed out that the alignment of the magnetic moments of an initially
random angular distribution would not take place instantaneously. Classically, the
alignment would involve the exchange of angular momentum between atoms and their estimate
of the time-scale for this process was 10¹¹ seconds, far longer than the 10⁻⁴ seconds that it
took the beam of atoms to pass through the magnetic field distribution of the magnets. It
was inferred that there was something seriously wrong with the mechanical and dynamical
laws at the atomic level. This was only the beginning of the difficulties with the old quantum
theory; much worse was about to follow.
8.1 Bohr’s first model of the periodic table


We now need to retrace our steps and follow Bohr’s activities from his great Trilogy of 1913 
to his model for the periodic table of 1922. In 1914, Bohr petitioned the Danish government 
to create a professorship for him in theoretical physics and this was granted two years later. 
In the meantime, he returned to Manchester as Schuster Reader in physics before taking up 
his appointment as Professor of Theoretical Physics in Copenhagen in 1916. In 1917, Bohr 
successfully petitioned the physics faculty of Copenhagen University to found an Institute 
for Theoretical Physics with Bohr as its founding director. The Institute was officially opened 
in 1921, but the strain of setting up the new institute combined with his continuing, almost 
obsessive, research programme into the fundamentals of quantum theory, took a heavy toll 
and in 1921 he suffered a serious bout of ill-health. Despite this, he remained the driving 
force behind the attack on the problems of quantum theory on a very broad front. Thanks 
to his tireless efforts and inspiration, Copenhagen became one of the two major centres 
for the development of quantum theory, the other being in Göttingen, through the 1920s 
when the foundations of the old theory were to be cut away and replaced by completely 
new concepts. The Institute for Theoretical Physics, commonly referred to as the ‘Bohr 
Institute’, became formally the Niels Bohr Institute in 1965, three years after his death.

Already in the second paper of the Trilogy of 1913, Bohr aspired to account not just for
the spectrum of hydrogen, but for the structure of all the atoms in the periodic table and
their chemical properties (Bohr, 1913b). He had convinced himself that radioactive decay
was associated with the nucleus of the Rutherford atom while the chemical properties were
associated with the system of electrons orbiting the nucleus. His first efforts to define the
electronic structure of atoms in the periodic table are engagingly told by Heilbron (1977).
Heilbron summarises the four assumptions upon which Bohr made his first tentative steps
into atomic structure: (i) all the electrons lie in circular orbits in the same plane through the
nucleus; (ii) the populations of the innermost quantised rings increase with atomic number;
(iii) each electron, no matter how far from the nucleus, has angular momentum h/2π in
its ground state; (iv) the ground state is characterised by that which results in the lowest
energy for the given total angular momentum. Bearing in mind the need to account for the
periodic structure of the periodic table, he came up with an assignment of the numbers of
electrons to different shells in atoms given in Table 8.1.

Table 8.1 Bohr’s electron configurations for the first 24 elements of the periodic table (Bohr, 1913b). The
numbers in brackets show the number of electrons in the n = 1, 2, 3, ... shells.

 1 H  (1)        7 N  (4,3)      13 Al (8,2,3)     19 K  (8,8,2,1)
 2 He (2)        8 O  (4,2,2)    14 Si (8,2,4)     20 Ca (8,8,2,2)
 3 Li (2,1)      9 F  (4,4,1)    15 P  (8,4,3)     21 Sc (8,8,2,3)
 4 Be (2,2)     10 Ne (8,2)      16 S  (8,4,2,2)   22 Ti (8,8,2,4)
 5 B  (2,3)     11 Na (8,2,1)    17 Cl (8,4,4,1)   23 V  (8,8,4,3)
 6 C  (2,4)     12 Mg (8,2,2)    18 A  (8,8,2)     24 Cr (8,8,4,2,2)


Unlike the more theoretically minded physicists, Bohr set considerable store by understanding
the chemical properties of the atoms, for example, their valences and the similarity
of the properties of the elements in the different columns of the periodic table. The stability
of the circular orbits with increasing numbers of electrons also played a role in his thinking,
following the hints provided by Nicholson’s analysis of the stability of rings of electrons 
to oscillations perpendicular to the orbital planes of the electrons (see Sect. 4.4). Since 
the electrons were all assumed to have circular orbits in the same plane, the outermost 
‘valence’ ring was shielded from the nuclear charge by the inner completed shells. As a 
result, the valence shells of similar elements such as the alkali metals, lithium, sodium and 
potassium were given the same outer electron structure, as were those of the alkaline earth metals,
beryllium, magnesium, calcium. This model had little lasting appeal, but it is indicative of 
Bohr’s adventurous spirit in attempting to encompass a wide range of quantum and atomic 
phenomena in a single scheme in the context of his great innovations of 1913. The concept 
of shells of electrons was introduced, although Bohr was well aware of the provisional 
nature of his assignments of quantum numbers to the different elements. This was the be- 
ginning of the attempts by many theorists to create models of atoms including the quantum 
concepts introduced by Bohr. Notice that the electron orbits are entirely determined by a 
single quantum number, the principal quantum number n. As a result, this model is often 
referred to as a one-quantum structure. 

Following this foray into atomic structure, Bohr laid aside this aspect of his researches.
As he wrote to Moseley in 1913, ‘For the present I have stopped speculating on atoms.’ 
Others including Ladenburg, Vegard, Langmuir and Bury proposed alternative schemes, 
but none of these gained general acceptance by the community of scientists. Bohr was 
profoundly impressed by Sommerfeld’s extension of the Bohr atom to elliptical orbits and 
the explanation of the fine-structure splitting of spectral lines as the effects of special 
relativity upon the orbits, which were described in Chaps. 5 and 6. As Bohr wrote to
Sommerfeld in March 1916, 


‘I thank you so much for your paper, which is so beautiful and interesting. I do not think 
that I have ever read anything which has given me so much pleasure.’ 


At the same time, the analyses of the Stark and Zeeman effects by Schwarzschild, Epstein, 
Sommerfeld and Debye greatly advanced understanding of the nature of quantum effects 
within atoms. Bohr had planned to write a comprehensive review of quantum phenomena 
in 1916 in an attempt to present the theory in a logically consistent manner, but he delayed 
publication in the light of these advances. In fact, he did not publish the results of his 
deliberations until 1918 when his important papers under the title On the quantum theory 
of line spectra surveyed all that was known about atomic spectra and also advocated
Ehrenfest’s adiabatic hypothesis (Sect. 5.6) as a guiding principle for the formulation of
quantum theory (Bohr, 1918a,b). These papers contained the first formulation of Bohr’s 
correspondence principle as well as the selection rules for permitted quantum transitions 
(see Chap. 6). 

For the next few years, Bohr was largely preoccupied with the construction of the Institute 
of Theoretical Physics in Copenhagen and he published relatively few papers, but in 1921 
he was spurred into action by a letter to Nature by Norman R. Campbell which proposed 
that the static models of atoms and molecules developed by Lewis and Langmuir were ‘not 
really inconsistent’ with the Bohr-Sommerfeld model (Campbell, 1920). Bohr disagreed 
profoundly with this statement and went much further in advocating a model of atomic 
structure based strictly upon the quantum principles he had enunciated in his papers of 
1918 and the correspondence principle. In his words, 


‘Thus, by means of a closer examination of the progress of the binding process this 
principle offers a simple argument for concluding that these electrons are arranged in 
groups in a way which reflects the periods exhibited by the chemical properties of the 
elements within a sequence of increasing numbers. In fact, if we consider the binding of a 
large number of electrons by a nucleus of higher positive charge, this argument suggests 
that after the first two electrons are bound in one-quantum orbits, the next eight electrons 
will be bound in two-quantum orbits, the next eighteen in three-quantum orbits, the next 
thirty two in four-quantum orbits.’ (Bohr, 1921a) 


This announcement generated a great deal of interest in the physics and chemistry com- 
munity which was anxious to hear about Bohr’s apparent solution of the problem of under- 
standing atomic structure and the nature of the periodic table of the elements. Bohr refined 
his proposals in a further letter to Nature later that year (Bohr, 1921b). The opportunity to 
give a complete picture of his thinking about quantum physics and the theory of the periodic 
table came in 1922 with the invitation to deliver the Wolfskehl lectures at Göttingen, which 
were to have a lasting impact upon the development of quantum theory. 


8.2 The Wolfskehl lectures and Bohr’s second theory of the periodic table


The mathematician Paul Wolfskehl died in 1906 and bequeathed 100,000 Marks to the 
Königliche Gesellschaft der Wissenschaften of Göttingen as a prize for the first person to 
provide a complete proof of Fermat’s Last Theorem,² the statement that there are no integers
x, y, z and n which satisfy the equation x^n + y^n = z^n for x, y, z ≠ 0 and n > 2. Until the
prize was awarded, the interest on the bequest should be used by the Göttingen Academy to 
advance the mathematical sciences. Under Hilbert’s direction, an annual series of lectures 
was set up to attract distinguished scientists in the areas of mathematics and physics to 
deliver a series of lectures on frontier topics in these disciplines. The first set of lectures 
was delivered by Poincaré in 1909, subsequent lecturers including Lorentz, Sommerfeld, 
Einstein, von Smoluchowski, Mie and Planck. The invitation to Bohr was sent in November

Fig. 8.1 The periodic table as presented by Thomsen which was adopted by Bohr in his analysis of the electronic configuration
of atoms (Thomsen, 1895a,b).

1920 and, once his health had recovered, he delivered his seven lectures during the period 
12-22 June 1922. The lectures attracted a galaxy of those working on the key problems of 
quantum physics from all over Germany, as well as Ehrenfest from Leiden and Klein and 
Oseen from Bohr’s Institute in Copenhagen. Sommerfeld brought with him from Munich 
the 20-year-old Werner Heisenberg, while the 22-year-old Wolfgang Pauli came from Hamburg.
The audience totalled about 100 physicists and mathematicians. Bohr took the occasion 
very seriously and used it to present his personal survey of the entire field of quantum and 
atomic physics. Soon, the event became known as the Bohr Festspiele, or ‘Bohr Festival’ 
and was to have a major influence on the future directions which those present took to the 
study of quantum physics. 

The contents of the lectures are described in the fourth volume of Bohr’s collected works 
(Bohr, 1977). The first three lectures on the 12, 13 and 14 June 1922 concerned: Lecture 1: 
Review of the history of quanta and quantisation and the Bohr model of the atom of 1913; 
Lecture 2: The Bohr-Sommerfeld model of the atom, Ehrenfest’s adiabatic hypothesis, 
multiply periodic systems and the relativistic theory of electron orbits; Lecture 3: The 
interpretation of spectra, the Zeeman and Stark effects, the correspondence principle, space 
quantisation and the Stern–Gerlach experiment. These topics have been discussed in earlier
chapters. Lectures 4, 5, 6 and 7 were devoted to Bohr’s interpretation of the periodic table 
in atomic terms. In this analysis, he used the form of the periodic table presented by his 
compatriot, Julius Thomsen, shown in Fig. 8.1, in which the lines join elements with similar 
chemical properties (Thomsen, 1895a,b). The important innovation of Thomsen’s scheme,

Fig. 8.2 Bohr’s models for the orbits of electrons in (a) hydrogen, (b) helium, (c) lithium, (d) neon and (e) sodium atoms. In the
original colour versions of this diagram, Fig. 8.4 and Fig. 8.5, odd principal quantum numbers n = 1, 3, 5, ... are
coloured red and even values n = 2, 4, 6, ... black (Kramers and Holst, 1923).

as compared with that of Mendeleyev (Fig. 1.2), was the identification of the noble gases, 
helium (2), neon (10), argon (18), krypton (36) and xenon (54) as elements of zero valence 
which completed the various periods of the periodic table. As expressed by Thorpe in his 
History of Chemistry, Vol. 2. From 1850 to 1910, 


‘The valency of such an element would be zero, and therefore in this respect also it would
represent a transitional stage in the passage from the univalent electronegative elements 
of the seventh to the univalent electropositive elements of the first group.’ (Thorpe, 1910) 


Lecture 4 was devoted to the first period of the table, the atoms hydrogen and helium. 
The hydrogen atom was in excellent agreement with the Bohr-Sommerfeld theory and 
was illustrated in the first of a beautiful set of plates prepared specially for the occasion 
by Bohr (Fig. 8.2a). These diagrams use a notation in which the state of the electron is

Fig. 8.3 An example of a penetrating orbit from Schrödinger’s paper of 1921 (Schrödinger, 1921). The circle represents the
atomic core while the ellipses illustrate the trajectories outside and inside the core. Within the core, the electron feels
the full force of the nuclear charge while outside the core the nuclear charge is shielded by the inner electrons.

indicated by the principal quantum number n and the azimuthal quantum number k and
written n_k. If there are x electrons in the state n_k, the designation (n_k)_x is used. We recall
that, according to the Bohr-Sommerfeld model of the atom, the orbits are circular if k = n
and the orbits with the greatest eccentricity have k = 1 if k ≠ n. Helium proved to be much
more problematic and most of the lecture was devoted to describing these difficulties. The 
theory had to account for the fact that two separate systems of lines appeared in the helium 
spectrum, what were termed parahelium, which consisted of singlet lines, and orthohelium 
which consisted of triplet lines (see Fig. 16.1). Bohr and Kramers had wrestled with the
problems of understanding the dynamics and stability of the helium atom and eventually 
adopted a model consisting of two electrons in separate circular orbits about the doubly 
charged nucleus. Their stability arguments led them to the conclusion that the electrons 
could not occupy the same plane, but rather that the planes of the two orbits had to be 
inclined to each other as illustrated in Fig. 8.2b. Note that the size of the helium atom is
smaller than that of the hydrogen atom because of its doubly charged nucleus. Unlike the case of
hydrogen in which the ionisation potential and the energy levels were precisely predicted 
by the quantum theory, satisfactory agreement could not be achieved for the helium atom. 

Undaunted, Bohr proceeded to the rest of the periodic table in Lectures 5 and 6. There were
two important influences on his thinking. The first was what he called his Aufbauprinzip, 
or ‘building-up principle’, in which the atomic structure of successive elements is obtained 
by adding an electron to the pre-existing structure of the previous element in the periodic 
table, noting at the same time that the charge on the nucleus increased, providing stronger 
binding to the nucleus. The second concept was the use of penetrating orbits to account 
for the valence electrons. The elliptical orbits of the Bohr-Sommerfeld model now came 
into their own, particularly thanks to a foray into atomic structure by Erwin Schrödinger.
In his paper of 1921, he solved the problem of the dynamics of an electron in a quasi-
elliptical orbit which penetrates within the atomic core of the inner electrons (Fig. 8.3)
(Schrödinger, 1921). In the outer region, the electron experiences the electrostatic force
of a single electronic charge, while inside the circular inner orbit, it feels the full force
of the nucleus. The small ellipse embedded within the circular orbit shows the resulting
kinematics. In addition, Schrödinger was able to account for the Rydberg formula of the
alkali metals, the reasoning being similar to the arguments described in Sect. 6.6.

Bohr argued that the third electron in lithium could not be in a 1_1 orbit because it would
be too tightly bound to the nucleus. The solution was to place it in the next shell as a 2_1
elliptical orbit about the helium nucleus, as illustrated in Fig. 8.2c. Notice the distortions
of the ellipse close to the ‘helium core’ and the consequent precession of the elliptical
orbit about the nucleus. The electron in the 2_1 orbit is evidently less tightly bound to the
nucleus and becomes the valence electron of lithium. Bohr placed the next electron in a
2_1 orbit to create the beryllium atom with double valence. Proceeding beyond boron, Bohr
found himself in uncertain waters. The one fixed point was that, by the time he reached the
next noble gas neon, the structure should be stable with what we would now call a ‘closed
shell’. He achieved this by filling up the n = 2 state with 2_1 and 2_2 orbits, placing four
electrons in each and so accounting for the atomic number A = 10. The structure he had
in mind is shown in Fig. 8.2d. The diagram is to be interpreted as four 2_1 orbits and four
2_2 orbits in three dimensions as explained by Kramers and Helge Holst in their popular
book (Kramers and Holst, 1923). Bohr’s dilemma is indicated in the published form of his
electron assignments in the periodic table of Table 8.2 in which boron is tentatively given
a 2_2 electron and the filling of the remaining electrons to the shell left uncertain.

With the n = 2 shell filled, Bohr immediately assigned the eleventh electron to a 3_1 orbit
which gave it a similar valence role to that of lithium (Fig. 8.2e) while the twelfth electron
was also assigned a 3_1 orbit corresponding to the magnesium atom. In filling up the rest
of the n = 3 shell, there were now elliptical 3_2 orbits available and Bohr proposed that the
completion of the n = 3 shell at argon should consist of four electrons with 3_1 orbits and
four with 3_2 orbits (Fig. 8.4a). The next step was to continue to the n = 4 shell and it was
straightforward to assign one 4_1 electron to potassium and a second 4_1 electron to calcium.
However, now he had to incorporate the elements from scandium to nickel. These elements
have similar chemical properties and this could be explained if the inner n = 3 shell was
completed, including the circular 3_3 orbits, rather than electrons being added to the n = 4
level. Bohr advanced arguments that, in filling up the n = 3 levels rather than proceeding
with the n = 4 levels, these atoms would be in a lower state of total energy. The building up
of the elements proceeded in this way, with closed shells at krypton (Fig. 8.4c) and xenon 
(Fig. 8.4d). The rare earth elements from cerium (58) to ytterbium (70) were associated 
with the completion of inner shells with the result that their chemical properties would be 
very similar. In his Nobel lecture of 1922, Bohr went so far as to state that 


‘Indeed, it is scarcely an exaggeration to say that if the existence of the rare earths had
not been established by direct experimental investigation, the occurrence of a family of 
elements of this character within the sixth period of the natural system of the elements 
might have been theoretically predicted.’ (Bohr, 1922) 


Bohr proceeded to assign electrons to periods up to n = 7, his sketch of the radium 
atom, drawn at twice the scale of the earlier diagrams, showing clearly the ‘shell’ structure 
of the electron distributions (Fig. 8.5). He even suggested that the unknown element with 
A = 118 should be stable. 
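
The arithmetic behind the closed shells can be sketched in a few lines. The period lengths used below are the standard modern values, supplied here as an assumption rather than taken from Bohr's table; their running total reproduces the atomic numbers of the noble gases quoted earlier (2, 10, 18, 36, 54) and ends at A = 118.

```python
from itertools import accumulate

# Lengths of the seven periods of the periodic table.
# Assumption: standard modern values, not numbers read from Table 8.2.
PERIOD_LENGTHS = [2, 8, 8, 18, 18, 32, 32]

# Atomic number of the element that closes each period (running total).
closing_elements = list(accumulate(PERIOD_LENGTHS))

# Every period length is of the form 2 * n**2 for n = 1, 2, 3 or 4.
allowed_lengths = {2 * n * n for n in range(1, 5)}

print(closing_elements)
print(set(PERIOD_LENGTHS) == allowed_lengths)
```

The last entry of `closing_elements` is the atomic number of the element that would close the seventh period.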

Bohr implied that his assignments of the electrons to different shells were based upon 
detailed calculations, including application of the correspondence principle to atomic struc-
ture. In fact, it seems that the assignments were primarily based upon symmetry, the need

Table 8.2 Bohr’s model of 1922 for the distribution of electrons in atoms of the periodic table. The columns
are the n_k orbits 1_1; 2_1, 2_2; 3_1, 3_2, 3_3; 4_1, 4_2, 4_3, 4_4; 5_1, 5_2, 5_3, 5_4, 5_5; 6_1, 6_2, 6_3, 6_4, 6_5, 6_6; 7_1, 7_2.
[Table entries not recoverable from the source.]


Downloaded from Cambridge Books Online by IP 132.166.47.205 on Thu Jan 23 07:53:34 GMT 2014. 
http://dx.doi.org/10.1017/CB09781139062060.009 
Cambridge Books Online © Cambridge University Press, 2014 





Fig. 8.4 Bohr’s models for the orbits of electrons in (a) argon, (b) copper, (c) krypton and (d) xenon atoms. In the original colour
version of this diagram, odd principal quantum numbers n = 1, 3, 5, ... are coloured red and even values
n = 2, 4, 6, ... black (Kramers and Holst, 1923).


to account for the similarities of the chemical elements, his deep pondering about how the 
quantum concepts associated with the Bohr-Sommerfeld model could be extended to the 
complete periodic table, and intuition. Bohr’s closest collaborator, Hans Kramers, proba- 
bly had a better understanding of Bohr’s meaning of the correspondence principle and its 
applications than anyone. In his popular book with Holst, he emphasised the provisional 
nature of the assignments of the electron distributions and of the arguments which led to 
these (Kramers and Holst, 1923). As remarked by Kragh, no evidence has been found that 
Bohr’s conclusions were based upon detailed mathematical calculations (Kragh, 1985). 
Bohr’s assignments of electronic structures to the elements had a major success in 
predicting the properties of the unknown element with atomic number A = 72. According 
to his scheme, it should have properties similar to zirconium with A = 40 (see Table 8.2). 
A spanner was thrown in the works with the claim that an element with A = 72 had been
discovered by French scientists in a sample of rare earths which they interpreted as a

Fig. 8.5 Bohr’s model of the radium atom drawn on twice the scale of the diagrams shown in Figs. 8.2 and 8.4 (Kramers and
Holst, 1923). In the original colour version of this diagram, odd principal quantum numbers n = 1, 3, 5, ... are
coloured red and even values n = 2, 4, 6, ... black.

new element which they called celtium. They inferred that the element should belong to 
the rare earth group of lanthanides, in conflict with Bohr’s assignment shown in Table 8.2. 
Bohr eventually persuaded Coster to examine ores of zirconium and it was found that indeed the
element with A = 72 was present with high abundance in all such samples. The element
was named hafnium, after the Latin name for Copenhagen, Hafnia. Bohr announced this
discovery in his Nobel Prize lecture of 1922.⁴

Bohr’s lectures had a major impact, particularly upon the younger participants. Although 
provisional, Bohr’s model for the periodic table was an attempt using plausible quantum 
physics to put order into many different aspects of the periodic table of the chemical 
elements. As remarked later by Pauli, 


‘It made a strong impression upon me that Bohr at that time and in later discussions was 
looking for a general explanation that should hold for the closing of every electron shell 
and in which the number 2 was considered as essential as 8.’ (Pauli, 1964)


The Bohr model provided a framework for many different attacks upon the problems of the
quantum physics of atoms and molecules. 


8.3 X-ray levels and Stoner’s revised periodic table 


It is striking that Bohr had been able to achieve so much employing only two quantum 
numbers, the principal quantum number n and the azimuthal quantum number k. There is
no mention of the inner quantum number, j or J, introduced by Sommerfeld and Landé
to account for the anomalous Zeeman effect. Most of Bohr’s attention had concentrated 
upon the interpretation of optical data and the valencies of the chemical elements which 
are associated with the outer electronic structures of atoms. In contrast, X-ray emission 
lines and X-ray absorption edges provided direct information about the innermost shells of 
atoms. Bohr discussed the importance of the X-ray measurements in his seventh and final 
Wolfskehl lecture, but no mention is made of the inner quantum number. 

In Moseley’s pioneering X-ray experiments of 1913 and 1914, the K and L X-ray lines 
were used to order the chemical elements according to their atomic numbers, as discussed 
in Sect. 4.6 (Moseley, 1913, 1914). The interpretation of Moseley’s data in terms of the 
Bohr model was taken up by Kossel (1914, 1916). He reasoned that, when an electron is 
ejected from the innermost n = 1 stationary state of a Bohr atom, this vacancy is refilled
by an electron jumping from a higher energy state, in the process emitting a photon with 
energy equal to the difference in energies of the initial and final states. It was thus expected 
that there would be a series of lines associated with transitions originating from the n = 2, 
n = 3,...states. This model accounted for the fact that the characteristic X-ray emission 
lines were always observed in emission and not in absorption. In Kossel’s interpretation, 
the K and L shells corresponded to final states with principal quantum numbers n = 1
and 2. Although not stated explicitly, Kossel assumed that there is a maximum number of 
electrons which could occupy each shell which becomes particularly stable when it is filled 
with the maximum number of electrons. 

X-ray spectroscopy techniques developed dramatically over the succeeding years,⁵ the
spectral resolution of the measurements increasing by a factor of at least 100. A review of 
these early developments is presented by Manne Siegbahn (1962), one of the pioneers of 
precision X-ray spectroscopy, in which he pointed out that the X-ray spectroscopists had 
access to wavelengths in the range 0.1–20 Å, far exceeding the range available to optical
spectroscopists. The high spectral resolution enabled the various X-ray multiplets
to be studied in detail. The clustering of X-ray lines into K and L shells is illustrated by the
X-ray spectrum of silver shown in Fig. 8.6a. Although there is a great deal of fine structure, 
the separation into K and L shells is clear. In addition to these emission spectra, the X-rays 
are absorbed by atoms, but rather than emission lines, absorption edges were observed, as 
illustrated by the variation of the absorption coefficient of X-rays by silver atoms (Fig. 8.6b).
The wavelengths of the absorption edges are shorter than those of the emission lines of
the same series. This was attributed to the fact that absorption of X-radiation from, say,
the K level requires the photons to have energy ε > E_K, the ionisation energy from the K
shell. 

It can be seen from Fig. 8.6b that the K series is a singlet, in that there is only a single
absorption edge, whereas the L series has three absorption edges. Proceeding to higher 
levels for elements with greater atomic numbers, it was found that the M series had five 
edges and N seven. These results found a natural interpretation as multiplets of the different 
shells of the atoms. Whilst a given level had the same principal quantum number n, different 
sublevels would be associated with the different allowed values of k. This interpretation 
was, however, in conflict with Bohr’s assignments of the number of sublevels in the L shells
which were only defined by the two quantum numbers n = 2 and k = 2 and 1. There are only
two possible combinations of these quantum numbers (n = 2, k = 2) and (n = 2, k = 1)

Table 8.3 Landé’s inferred distribution of the quantum numbers among the K, L and M X-ray levels
(Landé, 1923b), including the equivalent optical terms derived from optical spectroscopy (Stoner, 1924).
Note that, for consistency of notation, J is used for the inner quantum number rather than Stoner’s j.
[Table entries not recoverable from the source.]

Fig. 8.6 (a) A schematic diagram showing the X-ray emission line spectrum of silver, clearly separating the K from the L series
of lines. (b) The absorption coefficient of silver as a function of wavelength, showing the X-ray edges associated with
absorption from the K and L levels (Semat, 1962).

which would have different energies since the former is a circular orbit and the latter an
ellipse. This conflicted with the observation that the L state is a triplet and correspondingly
the M series is a quintuplet rather than a triplet. Evidently, an additional quantum number
was needed and this was provided by the inner quantum number J. Landé proposed the
distribution of electrons among the K, L and M levels shown in Table 8.3. Accompanying
these assignments was a set of selection rules in which k would change by 1 and J by 1 
or 0. 
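
The counting of sublevels and the selection rules just described can be illustrated with a short sketch. It assumes, following the rule stated for Stoner's scheme later in this section, that k runs from 1 to n and that J takes the values k and k - 1 with J = 0 excluded; the function names are hypothetical and chosen only for this illustration.

```python
from itertools import product

# Principal quantum numbers of the X-ray shells discussed in the text.
SHELLS = {"K": 1, "L": 2, "M": 3, "N": 4}


def sublevels(n):
    """Return the (k, J) pairs for principal quantum number n.

    Assumption: k = 1, ..., n and J = k - 1 or k, with J = 0 excluded.
    """
    return [(k, J) for k in range(1, n + 1) for J in (k - 1, k) if J >= 1]


def allowed(upper, lower):
    """Selection rules quoted in the text: k changes by 1, J by 1 or 0."""
    (k1, J1), (k2, J2) = upper, lower
    return abs(k1 - k2) == 1 and abs(J1 - J2) <= 1


# Number of sublevels (absorption edges) in each shell.
multiplicity = {name: len(sublevels(n)) for name, n in SHELLS.items()}

# Transitions from the L sublevels into the single K level.
l_to_k = [(up, lo) for up, lo in product(sublevels(2), sublevels(1))
          if allowed(up, lo)]

print(multiplicity)
print(l_to_k)
```

With these assumptions the shells K, L, M and N come out with 1, 3, 5 and 7 sublevels, matching the numbers of absorption edges quoted above, and only the two L sublevels with k = 2 can feed the K level.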

In 1924 Table 8.3 was used by Edmund Stoner, then in the final year of his postgraduate 
studies with Rutherford and Fowler at Cambridge, to make the next major advance in

Fig. 8.7 Stoner’s modification of Bohr’s distribution of electrons in the shells of the inert gases (Stoner, 1924). The number of
electrons in a given atom is indicated by all the numbers above and to the left of the heavy line drawn under each
atom’s symbol (Heilbron, 1977).


defining the electronic structures of atoms. He was well aware of the difficulties of the Bohr 
scheme and favoured Landé’s scheme. He noted in addition that there is a strong analogy 
with the inferred structure of the atom based upon Landé’s interpretation of the anomalous 
Zeeman effect. The X-ray levels could be related to the optical terms inferred from the 
analysis of optical spectra. The last line of Table 8.3 shows the corresponding optical terms 
for the doublet terms of the alkali metals in Stoner’s notation. In his paper, Stoner (1924) 
wrote: 


‘The number of electrons in each completed level is equal to double the sum of the inner 
quantum numbers as assigned, there being in the K, L, M, N levels, when completed, 
2, 8 (= 2 + 2 + 4), 18 (= 2 + 2 + 4 + 4 + 6), ... electrons. It is suggested that the number
of electrons associated with each sub-level separately is also equal to double the inner
quantum number.’ 


Thus, according to Stoner’s arithmetic and referring to Table 8.3, 


The K level contains 2 × 1 = 2 electrons;
The L_I sublevel contains 2 × 1 = 2 electrons;
The L_II sublevel contains 2 × 1 = 2 electrons;
The L_III sublevel contains 2 × 2 = 4 electrons.


In this way, Stoner built up the periodic table, Fig. 8.7 showing the distribution of electrons 
for the closed shells of the inert gases. The sublevels of a given shell, indicated by roman 
capital numerals, are characterised by the azimuthal quantum number k and the inner 
quantum number J which takes values k and k − 1. Therefore, in Stoner’s scheme, the
maximum occupancy of a k shell is 2 × [k + (k − 1)] = 4k − 2. For a given value of n,
this resulted in the filling of the sublevels with 2 (k = 1), 6 (k = 2), 10 (k = 3), 14 (k = 4),
18 (k = 5), ... The number of electrons in a completely filled shell with principal quantum
number n is 2, 8, 18, 32, 50, ..., in other words 2n². In Fig. 8.7, the total number of electrons in a given
atom is indicated by all the numbers above and to the left of the heavy line drawn under
each atom’s symbol. As compared with Bohr’s scheme, there was a greater concentration 
of electrons in the outer subgroups and the closure of the inner subgroups at earlier stages. 
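
Stoner's arithmetic is simple enough to check directly. The following is a minimal sketch, assuming only the rules stated above (each sublevel holds 2J electrons, and J takes the values k and k - 1 with J = 0 excluded); the function names are illustrative and not Stoner's.

```python
def sublevel_occupancy(J):
    """Stoner's rule: a sublevel with inner quantum number J holds 2J electrons."""
    return 2 * J


def k_occupancy(k):
    """Electrons for a given k: the J = k - 1 and J = k sublevels combined."""
    return sum(sublevel_occupancy(J) for J in (k - 1, k) if J >= 1)


def shell_occupancy(n):
    """Electrons in a completed shell: sum over k = 1, ..., n."""
    return sum(k_occupancy(k) for k in range(1, n + 1))


per_k = [k_occupancy(k) for k in range(1, 6)]
per_shell = [shell_occupancy(n) for n in range(1, 6)]

print(per_k)
print(per_shell)
```

The two lists reproduce the sequences 4k - 2 and 2n² given in the text for k and n from 1 to 5.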
Stoner’s scheme was immediately adopted by Sommerfeld who remarked: 


‘Being based on the incontestable experience as to the number and order of X-ray levels, 
and on the association of quantum numbers with these, Stoner’s scheme is much more 
trustworthy than Bohr’s. It has an arithmetic rather than geometric-mechanical character; 
without assuming any symmetry of orbits it exploits not some, but all available data of 
X-ray spectroscopy.’ (Sommerfeld, 1925) 


In his memoirs, Stoner wrote: 


‘Probably no other single paper of mine has attracted so much attention... . It is of interest 
to note, however, that an explicit statement is effectively made of what later became known 
as the Pauli exclusion principle, though it is presented more as having been arrived at 
inductively from experimental findings rather than as a basic axiom for a deductive 
treatment of electron distribution as in Pauli’s paper. ...’ (Bates, 1969)


8.4 Pauli’s exclusion principle 


Stoner’s insights impressed Wolfgang Pauli, in particular his statement: 


‘Twice the inner quantum number does give the observed term multiplicity as revealed by
the spectra in a weak magnetic field . . . In other words, the number of possible states of the 
(core plus electron) system is equal to twice the inner quantum number, these 2J states 
being always possible and equally probable, but only manifesting themselves separately 
in the presence of the external field.’ (Stoner, 1924) 


Pauli inferred that a natural explanation of the shell structure of the atom was to postulate 
that the state of electrons in atoms is determined by four quantum numbers, n the principal 
quantum number, k the azimuthal quantum number, J the inner quantum number and m the
component of angular momentum in the direction of an applied field, where −J < m < J,
provided only one electron is allowed to occupy each of these states. These rules enabled
Pauli to recover Stoner’s results described in the last section. But the consequences were
much deeper and were to have profound implications for the future development of quantum
mechanics. Pauli had enunciated the principle of exclusion, according to which within an 
atom not more than one electron can occupy a stationary state with a single set of quantum 
numbers, the Pauli exclusion principle. In his words, 


‘There never exist two or more equivalent electrons in an atom which, in strong magnetic
fields, agree in all quantum numbers n, k, J and m. If there exists in the atom an electron 
for which these quantum numbers (in the external field) have definite values, this state is 
“occupied”.’ (Pauli, 1925) 









Pauli was impressed by the fact that the exclusion principle could account for the absence 
of certain states in, for example, the alkaline earth metals such as calcium, which would 
involve two electrons with the same quantum numbers. The physical significance of the 
inner quantum number J and the origin of the exclusion principle itself were, however,
obscure. Pauli associated the quantum number J with the valence electron itself rather than
with the core and, in the limit of strong magnetic fields when the Paschen-Back effect
dominates, he showed that J behaves like orbital angular momentum. He referred to the
property of the quantum number J of the valence electron as ‘classically not-describable
two-valuedness’. He was very close to discovering the spin of the electron and its magnetic
moment, but drew back from that final step.


8.5 The spin of the electron 


The discovery of the spin and magnetic moment of the electron by Samuel Goudsmit 
and George Uhlenbeck has an amusing history. The concept of the spin of the electron 
was first explored by Ralph Kronig in 1925 while he held a travelling fellowship from 
Columbia University. During his visit to Tübingen, then the centre of spectroscopic studies 
of atoms and molecules, Landé showed him a letter from Pauli describing his work on 
the exclusion principle and the need for four quantum numbers. Kronig hit upon the idea 
that the J quantum number was associated with the intrinsic spin of the electron. If the 
orbits of the electrons resembled a planetary system, he reasoned that the electron itself 
would be spinning, just like the planets. He then associated a magnetic moment of one Bohr 
magneton with the spinning electron. This concept immediately suggested an origin for the 
splitting of the D lines of sodium into $D_1$ and $D_2$ components. The electron experiences
a magnetic field $\mathbf{B}$ because of its motion through the electric field of the nucleus of
magnitude $\mathbf{B} = (1/c^2)(\mathbf{v} \times \mathbf{E})$. Consequently, there is an interaction energy associated
with the magnetic moment of the electron and the induced magnetic field of magnitude
$\boldsymbol{\mu} \cdot \mathbf{B}$. Because the J quantum number took up two orientations with respect to the magnetic
field direction, their interaction energies were different and gave excellent agreement with 
the observed splitting of the sodium D lines. Furthermore, this model was also in full 
agreement with Landé’s semi-empirical ‘relativistic splitting rule’. 

There were, however, concerns. Kronig went on to apply the magnetic coupling argument to
hydrogen atoms and found an answer similar to that found by Sommerfeld in his relativistic
treatment of the hydrogen atom. The worry was that Sommerfeld’s model accounted very
well for the observed fine-structure splitting of hydrogen and so there would have to be some
remarkable fine tuning if the combination of effects were to result in the observed splitting.
In fact, Kronig was always a factor of 2 out in obtaining the required compensation. There
was another concern, which was that the surface of a classical electron would
be rotating at a speed far exceeding the speed of light. This can be demonstrated by a simple
classical calculation. The Stern-Gerlach experiment was interpreted as evidence that the
angular momentum of the valence electron is $\hbar/2$. Therefore, the speed of rotation $v$ of the
surface of the electron would be given by the relation $m_{\rm e} v r = \hbar/2$. Using the expression
for the classical electron radius found by equating the rest mass energy of the electron
to its electrostatic potential energy, $r = e^2/4\pi\varepsilon_0 m_{\rm e} c^2$, we find $v = c/2\alpha$, where the fine-structure
constant $\alpha = e^2/2\varepsilon_0 hc \approx 1/137$. Thus, the speed of rotation of the surface of the electron
would far exceed the speed of light. Kronig discussed these ideas with Pauli, Kramers and 
Heisenberg who were strongly critical of the proposal, Pauli being particularly dismissive. 
Kronig let the matter rest. 
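
The size of the problem can be checked numerically. The following is a minimal sketch in Python, assuming CODATA values for the physical constants (these numbers are not quoted in the text) and treating the electron as a classical sphere of radius r carrying angular momentum ħ/2; the function names are purely illustrative.

```python
import math

# CODATA values in SI units (assumed here; not taken from the text).
E_CHARGE = 1.602176634e-19   # elementary charge e, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
HBAR = 1.054571817e-34       # reduced Planck constant, J s
M_E = 9.1093837015e-31       # electron mass, kg
C = 299792458.0              # speed of light, m/s


def classical_electron_radius():
    """r = e^2 / (4 pi eps0 m_e c^2)."""
    return E_CHARGE**2 / (4 * math.pi * EPS0 * M_E * C**2)


def fine_structure_constant():
    """alpha = e^2 / (4 pi eps0 hbar c), identical to e^2 / (2 eps0 h c)."""
    return E_CHARGE**2 / (4 * math.pi * EPS0 * HBAR * C)


def surface_speed_over_c():
    """Solve m_e v r = hbar / 2 for v and return v / c."""
    r = classical_electron_radius()
    v = HBAR / (2 * M_E * r)
    return v / C


if __name__ == "__main__":
    print("classical electron radius (m):", classical_electron_radius())
    print("surface speed v/c:", surface_speed_over_c())
    print("1/(2 alpha):", 1 / (2 * fine_structure_constant()))
```

With these values the ratio v/c comes out at about 68.5, which is just 1/2α.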

After the publication of Pauli’s paper on the exclusion principle, the idea was revived 
by Uhlenbeck and Goudsmit (1925a). They were impressed by two considerations. The 
first, pointed out to them by Ehrenfest, was the calculation by Abraham that if the charge 
of a rotating sphere is distributed over its surface, the magnetic moment is twice that 
if the charge is uniformly distributed through the sphere (Abraham, 1903). The second 
was the fact that Pauli’s paper required four quantum numbers to describe the motion 
of the electron. Since they equated the number of quantum numbers with the number 
of degrees of freedom, they could associate three of these with the kinematics of the 
motion of the electron, but what about the fourth? Their guess was that it was associated 
with the intrinsic angular momentum of the electron, rather than with that of the core. 
They were well aware that a consequence would be that the rotation speed of the surface 
of the electron would far exceed the speed of light. Pauli’s ‘classically not-describable 
two-valuedness’ was to be associated with two allowed orientations of the spin axis of 
the electron with respect to any chosen axis. Finally, they proposed that the ratio of the 
magnetic moment of the electron to its angular momentum was twice that associated with 
the electron’s orbital motion. Ehrenfest encouraged them to write a short paper for Die 
Naturwissenschaften and suggested that they should send it to Lorentz for comments. The 
following week they received a long response from Lorentz on the properties of rotating 
electrons. It revealed further flaws with the picture, including the fact that the magnetic 
energy would far exceed the rest mass energy of the electron and in fact would be greater 
than that of the proton. Uhlenbeck and Goudsmit decided against publication, but Ehrenfest 
had already sent the paper off to Die Naturwissenschaften in which it was published on 
20 November 1925. 

Reaction to the paper was mixed. Bohr welcomed the concept, referring to it as ‘a very
welcome supplement to our ideas of atomic structure’ in an addendum to the version of
Uhlenbeck and Goudsmit’s paper which appeared in Nature (Uhlenbeck and Goudsmit,
1925b). Kronig reiterated his concerns about the concept of electron spin and introduced 
a new problem, the effect of the spins of electrons which were assumed to be present in 
the nucleus (Kronig, 1925). In 1926, however, the consensus of opinion changed when 
the factor of 2 problem for the splitting of the fine structure of hydrogen was solved 
by Llewellyn Thomas (1926), who showed that a second-order relativistic term had been 
neglected in working out the magnitude of the spin—orbit coupling of the spinning electron 
about the nucleus, the effect referred to as Thomas precession. Specifically, the energy
shift associated with the interaction between the magnetic moment of the electron and 
the magnetic flux density B which it experiences in its instantaneous rest frame is 


$$\Delta E = -\boldsymbol{\mu} \cdot \mathbf{B} = \frac{2\mu_{\rm B}}{\hbar m_{\rm e} e c^2}\,\frac{1}{r}\frac{\partial U}{\partial r}\,\mathbf{L} \cdot \mathbf{S}, \tag{8.1}$$

where U(r) is the potential energy of the electron in the field of the nucleus and the
vectors L and S are the orbital and spin angular momentum vectors respectively. The
corresponding Thomas precession term, which takes account of time dilation between
the frames of reference of the orbiting electron and the external frame, amounts to

$$\Delta E = -\frac{\mu_{\rm B}}{\hbar m_{\rm e} e c^2}\,\frac{1}{r}\frac{\partial U(r)}{\partial r}\,\mathbf{L} \cdot \mathbf{S}, \tag{8.2}$$

that is, exactly half the magnetic spin-orbit interaction and with the opposite sign. This 
calculation persuaded Pauli that electron spin had to be taken really seriously and was to 
lead to his introduction of Pauli spin matrices in the following year. 

The problems of understanding electron spin uncovered by the pioneers of quantum 
mechanics were well-founded. It turned out that the concept was very effective in understanding
the features of atomic spectra and could be readily incorporated into Landé’s
vector model of the atom. It was emphasised by Pauli, however, that electron spin has the 
property of ‘classically not-describable two-valuedness’ — spin is an intrinsic property of 
the electron itself. It is not an angular momentum in the sense of classical mechanics, but 
rather, as Pauli expressed it, ‘an essentially quantum-mechanical property’. This can be 
appreciated from the fact that it vanishes if $\hbar \to 0$. We will return to these features of spin
in Sect. 16.6. 

Despite Einstein’s advocacy and Millikan’s dramatic verification of Einstein’s expression
for the relation between the frequency of the incident radiation and the stopping voltage
of the photoelectrons in the photoelectric effect (Sect. 3.7), the concept of light quanta was
not taken seriously by the majority of physicists. As expressed by Mehra and Rechenberg,


‘... the large majority of physicists working on quantum theory ... subscribed to what 
Planck, Nernst, Rubens and Warburg wrote in 1913 about Einstein: “that he may have 
sometimes missed the target in his speculations as, for example, in his theory of light-quanta.”
And during the following decade hardly anyone took light-quanta seriously until
there appeared a paper to settle this question completely: this was the paper presented by 
Arthur Holly Compton in December, 1922.’ (Compton, 1923) 


Compton demonstrated that the law of scattering of energetic X-rays by electrons in atoms 
follows precisely from the assumption that the quanta of light have energy $\varepsilon = h\nu$ and
momentum $\mathbf{p} = (h\nu/c)\,\mathbf{i}_k$. This was irrefutable evidence for the wave-particle duality
which was to play a central role in the development of quantum theory. 


9.1 The Compton effect 




The story begins with Thomson’s classical analysis of the scattering of radiation by free 
electrons which was described in Sect. 4.3.1. The cross-section for Thomson scattering is 


$$\sigma_{\rm T} = \frac{e^4}{6\pi\varepsilon_0^2 m_{\rm e}^2 c^4} = \frac{8\pi r_{\rm e}^2}{3} = 6.653 \times 10^{-29}\ {\rm m}^2, \tag{9.1}$$


where $r_{\rm e} = e^2/4\pi\varepsilon_0 m_{\rm e} c^2$ is the classical electron radius. The angular distribution of the
radiation is given by the expression for the differential cross-section for Thomson scattering 


$$\mathrm{d}\sigma_{\rm T} = \frac{r_{\rm e}^2}{2}\,(1 + \cos^2\alpha)\,\mathrm{d}\Omega, \tag{9.2}$$


where $\alpha$ is the angle between the incident beam and the direction of the scattered radiation.
Integrating (9.2) over solid angle gives (9.1). Barkla and Ayres showed that this formula
was a good description of the scattering of soft X-rays by electrons (Barkla and Ayres,
1911). For more energetic X-rays and γ-rays, however, the agreement was not so good and
in particular, the scattered X- and γ-rays were of lower energy than the incident radiation
(Gray, 1913). Gray repeated these experiments in 1920, again estimating the energies of the
scattered X-rays by their absorption properties. He concluded that: 

Fig. 9.1 The results of Compton's X-ray scattering experiments showing the increase in wavelength of the Kα line of
molybdenum as the deflection angle θ increases (Compton, 1923). The unscattered wavelength λ of the Kα line is
0.7107 Å (0.07107 nm).


‘the results we have obtained would be explained if we could always look on a beam of X-rays
or gamma-rays as a mixture of waves of definite frequencies, and if rays of a definite
frequency were altered in wavelength during the process of scattering, the wavelength 
increasing with the angle of scattering.’ (Gray, 1920) 
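
As a check on the Thomson results (9.1) and (9.2) quoted earlier in this section, the integration over solid angle can be carried out numerically. This is a minimal sketch, assuming CODATA values for the constants and a simple midpoint rule; the function names and the number of integration steps are illustrative choices, not part of the original analysis.

```python
import math

# CODATA values in SI units (assumed; not quoted in the text).
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
M_E = 9.1093837015e-31       # electron mass, kg
C = 299792458.0              # speed of light, m/s


def classical_electron_radius():
    """r_e = e^2 / (4 pi eps0 m_e c^2)."""
    return E_CHARGE**2 / (4 * math.pi * EPS0 * M_E * C**2)


def thomson_total_analytic():
    """Total cross-section 8 pi r_e^2 / 3, as in (9.1)."""
    return 8 * math.pi * classical_electron_radius()**2 / 3


def thomson_total_numeric(n_steps=20000):
    """Integrate (r_e^2 / 2)(1 + cos^2 a) over solid angle.

    The element of solid angle is dOmega = 2 pi sin(a) da, and the
    integral over a from 0 to pi is done with the midpoint rule.
    """
    r_e = classical_electron_radius()
    d_alpha = math.pi / n_steps
    total = 0.0
    for i in range(n_steps):
        alpha = (i + 0.5) * d_alpha
        differential = 0.5 * r_e**2 * (1 + math.cos(alpha)**2)
        total += differential * 2 * math.pi * math.sin(alpha) * d_alpha
    return total


if __name__ == "__main__":
    print("analytic  (m^2):", thomson_total_analytic())
    print("numerical (m^2):", thomson_total_numeric())
```

Both routes should agree with the value of about 6.65 × 10⁻²⁹ m² given in (9.1).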


With the invention of the X-ray spectrograph by the Braggs, the tools were available for 
more precise studies of the scattering of high energy X-rays (Bragg and Bragg, 1913a,b). 

In 1921, the challenge was taken up by Compton who was the first investigator to use a 
Bragg spectrometer in combination with a ‘recording device’ to analyse the wavelengths 
of the scattered radiation. He soon confirmed the result that the scattered wavelength was 
greater than the incident wavelength, stating, 


‘in addition to scattered radiation, there appeared in the secondary rays a type of fluorescent 
radiation, whose wavelength was nearly independent of the substance used as the radiator, 
depending only upon the wavelength of the incident rays and the angle at which the 
secondary rays are examined.’ (Compton, 1922) 


From the beginning the challenge was to understand the mechanism by which the wavelength 
of the radiation increased on scattering. Compton examined what would be expected if ‘each 
quantum of X-ray energy were concentrated in a single particle and would act as a unit on 
a single electron’. The result of his calculations was his famous formula for the change of 
wavelength of the X-rays as a function of scattering angle $\theta$,

$$\Delta\lambda = \left(\frac{h}{m_{\rm e} c}\right)(1 - \cos\theta). \tag{9.3}$$
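
To give a feeling for the size of the effect, formula (9.3) can be evaluated directly. The sketch below assumes CODATA values for h, the electron mass and c; the scattering angles are illustrative choices rather than Compton's measured settings, and the incident wavelength is the 0.7107 Å quoted for the molybdenum Kα line.

```python
import math

# CODATA values in SI units (assumed; not quoted in the text).
H = 6.62607015e-34       # Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
C = 299792458.0          # speed of light, m/s

MO_K_ALPHA = 0.7107e-10  # unscattered Mo K-alpha wavelength, m


def compton_shift(theta_deg):
    """Wavelength change (h / m_e c)(1 - cos theta), in metres."""
    theta = math.radians(theta_deg)
    return H / (M_E * C) * (1.0 - math.cos(theta))


if __name__ == "__main__":
    # Illustrative angles only.
    for theta_deg in (45.0, 90.0, 135.0):
        shift = compton_shift(theta_deg)
        print(f"theta = {theta_deg:5.1f} deg: "
              f"shift = {shift * 1e10:.4f} A, "
              f"scattered wavelength = {(MO_K_ALPHA + shift) * 1e10:.4f} A")
```

At 90° the shift is about 0.024 Å, a few per cent of the incident wavelength, which is why a Bragg spectrometer was needed to resolve it.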


In his classic experiment, he used the $K_\alpha$ line of molybdenum as the primary radiation
and graphite as the scatterer. The results were reported to the American Physical Society in
April 1923 and published a month later in the Physical Review (Compton, 1923). In fact, the
same relation had been worked out by Debye and submitted to the Physikalische Zeitschrift
only a month before Compton’s presentation (Debye, 1923). The change in wavelength
and the intensity of the radiation as a function of scattering angle $\theta$ were in excellent
agreement with the expectation of Einstein’s light quantum hypothesis (Fig. 9.1). Initially,
not all the experimenters could reproduce Compton’s results, but by the end of 1924, there
was general agreement that these held good and that the particle nature of light quanta was
securely established. In June 1929, Werner Heisenberg wrote in a review article, entitled 
The development of the quantum theory 1918-1928, 


‘At this time [1923] experiment came to the aid of theory with a discovery which would 
later become of great significance for the development of quantum theory. Compton found 
that with the scattering of X-rays from free electrons, the wavelength of the scattered rays 
was measurably longer than that of the incident light. This effect, according to Compton 
and Debye, could easily be explained by Einstein’s light quantum hypothesis; the wave 
theory of light, on the contrary, failed to explain this experiment. With that result, the 
problems of radiation theory which had hardly advanced since Einstein’s works of 1906, 
1909 and 1917 were opened up.’ (Heisenberg, 1929) 


In his reminiscences in the year before he died, Compton stated: 


‘These experiments were the first to give, at least to physicists in the United States, a
conviction of the fundamental validity of the quantum theory.’ (Compton, 1961) 


9.2 Bose-Einstein statistics 


One of the intriguing questions about Planck’s derivation of the black-body energy distribution
is why he obtained the correct answer using the ‘wrong’ statistical procedures. One
answer is that he may well have worked backwards from his definition of the entropy of
a set of oscillators in thermal equilibrium, as discussed in Sect. 2.6. The deeper answer
is that Planck had stumbled by accident upon the correct method of evaluating the statistics
of indistinguishable particles. These procedures were first demonstrated by the Indian
mathematical physicist Satyendra Nath Bose in a manuscript entitled Planck’s law and the
hypothesis of light quanta, which he sent to Einstein in 1924 (Bose, 1924). Einstein immediately
appreciated its deep significance, translated it into German himself and arranged for
it to be published in the journal Zeitschrift für Physik. Bose’s paper and his collaboration
with Einstein led to the establishment of the method of counting indistinguishable particles
known as Bose-Einstein statistics, which differ radically from classical Boltzmann
statistics.

Bose was not really aware of the profound implications of his derivation of the Planck 
spectrum. To paraphrase Pais’s account of his paper, Bose introduced three new features 
into statistical physics: 


(i) Photon number is not conserved. 

(ii) Bose divides phase space into coarse-grained cells and works in terms of the numbers
of particles per cell. The counting explicitly requires that, because the photons are 
taken to be identical, each possible distribution of states should be counted only once. 
Thus, Boltzmann’s axiom of the distinguishability of particles is gone. 

(iii) Because of this way of counting, the statistical independence of particles has gone. 


These are profound differences as compared with classical Boltzmann statistics. As Pais
remarks, 


‘The astonishing fact is that Bose was correct on all three counts. (In his paper, he 
commented on none of them.) I believe there had been no such successful shot in the dark 
since Planck introduced the quantum in 1900.’ (Pais, 1985) 


The argument starts by dividing the volume of phase space into elementary cells.
Consider one of these cells, which we label k and which has energy $\varepsilon_k$ and degeneracy
$g_k$, the latter meaning the number of available states with the same energy $\varepsilon_k$ within that
cell. Now suppose there are $n_k$ particles to be distributed over these $g_k$ states and that the
particles are identical. Then, the number of different ways the $n_k$ particles can be distributed
over these states is


$$\frac{(n_k + g_k - 1)!}{n_k!\,(g_k - 1)!} \tag{9.4}$$


using standard procedures in statistical physics. This is the key step in the argument and 
differs markedly from the corresponding Boltzmann result. In the standard Boltzmann 
procedure, all possible ways of distributing the particles over the energy states are included 
in the statistics, whereas in (9.4) duplications of the same distribution are eliminated because
of the factorials in the denominator. Notice that this is the point at which the statistical
independence of the particles is abandoned. The particles cannot be placed randomly in
all the cells since duplication of configurations is not allowed. 
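
The counting rule (9.4) can be checked by brute force for small numbers. The following sketch, with illustrative function names, enumerates every set of occupation numbers for n identical particles in g states and compares the count with the factorial expression, and with the g^n arrangements obtained if the particles are treated as distinguishable.

```python
from itertools import product
from math import comb


def bose_ways(n, g):
    """(n + g - 1)! / (n! (g - 1)!), written as a binomial coefficient."""
    return comb(n + g - 1, n)


def bose_ways_by_enumeration(n, g):
    """Count distinct occupation-number tuples (one entry per state)
    whose entries sum to n. Identical particles mean that each such
    tuple is counted once only."""
    return sum(1 for occupation in product(range(n + 1), repeat=g)
               if sum(occupation) == n)


def distinguishable_ways(n, g):
    """Each of n labelled particles independently chooses one of g states."""
    return g ** n


if __name__ == "__main__":
    for n, g in ((2, 3), (3, 4), (4, 2)):
        print(n, g, bose_ways(n, g), bose_ways_by_enumeration(n, g),
              distinguishable_ways(n, g))
```

For two particles in three states, for example, there are 6 distinct distributions of identical particles, compared with 9 arrangements of labelled particles.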

The result (9.4) refers only to a single cell in phase space and we need to extend it to all 
the cells which make up the phase space. The total number of possible ways of distributing 
the particles over all the cells is the product of all numbers such as (9.4), that is, 


$$W = \prod_k \frac{(n_k + g_k - 1)!}{n_k!\,(g_k - 1)!}.$$


We have not specified yet how the $N = \sum_k n_k$ particles are to be distributed among the k
cells. To do so, we ask as before, ‘What is the arrangement of $n_k$ over the states which results
in the maximum value of W?’ At this point, we return to the recommended Boltzmann
procedure. First, Stirling’s theorem is used to simplify ln W:

$$\ln W = \ln \prod_k \frac{(n_k + g_k - 1)!}{n_k!\,(g_k - 1)!} \approx \sum_k \ln \frac{(n_k + g_k)^{n_k + g_k}}{n_k^{n_k}\, g_k^{g_k}}.$$

Now we maximise W subject to the constraints $\sum_k n_k = N$ and $\sum_k n_k \varepsilon_k = E$. Using the
method of undetermined multipliers as before,

$$\delta(\ln W) = 0 = \sum_k \delta n_k \left\{ [\ln(g_k + n_k) - \ln n_k] - \alpha - \beta\varepsilon_k \right\},$$

so that

$$[\ln(g_k + n_k) - \ln n_k] - \alpha - \beta\varepsilon_k = 0,$$

and finally

$$n_k = \frac{g_k}{e^{\alpha + \beta\varepsilon_k} - 1}. \tag{9.7}$$


This is known as the Bose-Einstein distribution and is the correct statistics for counting 
indistinguishable particles. 
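
The last step of the derivation can be verified numerically: the occupancy (9.7) should make the bracketed expression in the variational condition vanish for any choice of multipliers. This is a minimal sketch with arbitrary illustrative values of the degeneracy, energy, α and β; none of these numbers comes from the text.

```python
import math


def bose_einstein_occupancy(g, eps, alpha, beta):
    """n_k = g_k / (exp(alpha + beta * eps_k) - 1), as in (9.7)."""
    return g / math.expm1(alpha + beta * eps)


def stationarity_residual(n, g, eps, alpha, beta):
    """ln(g + n) - ln(n) - alpha - beta * eps, which must vanish
    at the maximum of ln W subject to the two constraints."""
    return math.log(g + n) - math.log(n) - alpha - beta * eps


def boltzmann_occupancy(g, eps, alpha, beta):
    """Classical limit g * exp(-(alpha + beta * eps)), for comparison."""
    return g * math.exp(-(alpha + beta * eps))


if __name__ == "__main__":
    g, alpha, beta = 10.0, 0.3, 2.0          # illustrative values
    for eps in (0.1, 1.5, 5.0):
        n = bose_einstein_occupancy(g, eps, alpha, beta)
        print(eps, n, stationarity_residual(n, g, eps, alpha, beta),
              boltzmann_occupancy(g, eps, alpha, beta))
```

When the exponent α + βε is large, the Bose-Einstein and Boltzmann occupancies converge, since the occupation of each state is then much less than one.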

In the case of black-body radiation, we do not need to specify the number of photons 
present. We can see this from the fact that the distribution is determined solely by one 
parameter — the total energy, or the temperature of the system. Therefore, in the method of 
undetermined multipliers, we can drop the restriction on the total number of particles. The 
distribution automatically readjusts to the total amount of energy present, and so $\alpha = 0$.
Therefore,

$$n_k = \frac{g_k}{e^{\beta\varepsilon_k} - 1}. \tag{9.8}$$


By inspecting the low frequency behaviour of the Planck spectrum, we find that $\beta = 1/kT$,
as in the classical case.

Finally, the degeneracy of the cells in phase space $g_k$ for radiation in the frequency
interval $\nu$ to $\nu + \mathrm{d}\nu$ has already been worked out in our discussion of Rayleigh’s approach
to the origin of the black-body spectrum (Sect. 2.3.4). One of the reasons for Einstein’s 
enthusiasm for Bose’s paper was that Bose had derived this factor entirely by considering 
the phase space available to the photons, rather than appealing to Planck’s or Rayleigh’s 
approaches, which relied upon results from classical electromagnetism. Planck’s analysis 
was entirely electromagnetic and Rayleigh’s argument proceeded by fitting electromagnetic 
waves within a box with perfectly conducting walls. Bose considered the photons to have 
momenta $p = h\nu/c$ and so the volume of momentum, or phase, space for photons in the
energy range $h\nu$ to $h(\nu + \mathrm{d}\nu)$ is, using the standard procedure,

$$\mathrm{d}V_p = V\,\mathrm{d}p_x\,\mathrm{d}p_y\,\mathrm{d}p_z = V\,4\pi p^2\,\mathrm{d}p = \frac{4\pi h^3 \nu^2\,\mathrm{d}\nu}{c^3}\,V, \tag{9.9}$$

where V is the volume of real space. Now Bose considered this volume of phase space to
be divided into elementary cells of volume $h^3$, following ideas first stated by Planck in his
1906 lectures, and so the number of cells in phase space was

$$\mathrm{d}N_\nu = \frac{4\pi \nu^2\,\mathrm{d}\nu}{c^3}\,V. \tag{9.10}$$

He needed to take account of the two polarisation states of the photon and so Rayleigh’s
result was recovered; per unit volume,

$$\mathrm{d}N = \frac{8\pi \nu^2}{c^3}\,\mathrm{d}\nu \quad \text{with} \quad \varepsilon_k = h\nu. \tag{9.11}$$


We immediately find the expression for the spectral energy density of the radiation 


$$u(\nu)\,\mathrm{d}\nu = \frac{8\pi h \nu^3}{c^3}\,\frac{1}{e^{h\nu/kT} - 1}\,\mathrm{d}\nu. \tag{9.12}$$


This is Planck’s expression for the black-body radiation spectrum and it has been derived 
using Bose-Einstein statistics for indistinguishable particles. 
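
The assembly of the Planck spectrum from its two ingredients, the density of states (9.11) and the Bose-Einstein occupancy (9.8) with α = 0 and β = 1/kT, can be sketched as follows. CODATA values of h, k and c are assumed, and the frequencies and temperature used in the example are illustrative only.

```python
import math

# CODATA values in SI units (assumed; not quoted in the text).
H = 6.62607015e-34     # Planck constant, J s
K_B = 1.380649e-23     # Boltzmann constant, J/K
C = 299792458.0        # speed of light, m/s


def density_of_states(nu):
    """Number of states per unit volume per unit frequency, 8 pi nu^2 / c^3."""
    return 8 * math.pi * nu**2 / C**3


def mean_occupancy(nu, temperature):
    """Bose-Einstein occupancy per state with alpha = 0, beta = 1/kT."""
    return 1.0 / math.expm1(H * nu / (K_B * temperature))


def planck_energy_density(nu, temperature):
    """u(nu) = (states) x (energy per photon) x (occupancy)."""
    return density_of_states(nu) * H * nu * mean_occupancy(nu, temperature)


def rayleigh_jeans_energy_density(nu, temperature):
    """Classical low-frequency limit, 8 pi nu^2 k T / c^3."""
    return 8 * math.pi * nu**2 * K_B * temperature / C**3


if __name__ == "__main__":
    temperature = 300.0                      # illustrative temperature, K
    for nu in (1e9, 1e12, 1e14):             # illustrative frequencies, Hz
        ratio = (planck_energy_density(nu, temperature)
                 / rayleigh_jeans_energy_density(nu, temperature))
        print(f"nu = {nu:.0e} Hz: Planck / Rayleigh-Jeans = {ratio:.6f}")
```

At low frequencies the ratio tends to unity, which is the comparison that fixes β = 1/kT; at high frequencies the Planck expression falls exponentially below the classical one.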


Einstein did not stop there, but went on to apply these new procedures to the statistical 
mechanics of an ideal gas (Einstein, 1924, 1925). As he stated, the application of these 
statistics to a monatomic gas leads to a ‘far-reaching formal relationship between radiation 
and gas’. In particular, he realised that the expression he had derived for the fluctuations in 
black-body radiation in his remarkable paper of 1909 (Sect. 3.6) must apply equally for the 
statistics of monatomic gases as well. It will be recalled that his expression (3.43) for the 
fluctuations of black-body radiation consists of two parts: 


$$\frac{\sigma^2}{\varepsilon^2} = \left( \frac{h\nu}{\varepsilon} + \frac{c^3}{8\pi \nu^2 V\,\mathrm{d}\nu} \right). \tag{9.13}$$


This is a dramatic result. Simply from the statistics of indistinguishable particles, the 
expression for the fluctuations consists of one term associated with the statistics of non- 
interacting particles according to the Maxwell-Boltzmann prescription and a second term 
associated with interference phenomena due to the wave properties of the particles. For 
these reasons, Einstein was particularly intrigued by the work of Louis de Broglie, which 
was reported by Langevin at the Fourth Solvay Conference in April 1924. De Broglie 
had made another ‘shot in the dark’, by ascribing wave properties to the electron. This 
conjecture was to have profound implications for the development of quantum mechanics. 


9.3 De Broglie waves 


The experimental and theoretical development of quantum physics was largely concentrated 
in the major centres in Germany, Göttingen, Munich and Berlin, as well as at Bohr’s Institute 
in Copenhagen. Following the end of the First World War, communications were broken 
between France and Germany and the rapid exchange of results and ideas was not possible 
until the 1920s. Paris was somewhat off the main track of the development of quantum 
theory, but in many ways this was an advantage for the remarkable insights which resulted 
from Louis de Broglie’s research. Unlike the majority of German physicists, de Broglie 
had no hesitation in adopting Einstein’s light-quantum hypothesis. From the very beginning 
of his doctoral studies, his objective was to find techniques for reconciling the wave and 
particle descriptions of the behaviour of light. But he went much further and single-handedly 
introduced the concept of matter waves which completed the wave-particle duality of matter 
as well as light. Let us follow his reasoning. 

De Broglie’s three important papers were published in Comptes Rendus (Paris) in 1923 
and were brought together in a single paper in English in the Philosophical Magazine in 1924 
(de Broglie, 1923a,b,c, 1924b). He noted that there are two different ways of associating 
a frequency with a moving electron. He began by associating a frequency with the rest 
mass energy of the particle through Planck’s relation $h\nu_0 = m_0 c^2$. If the particle moves at
velocity v, the frequency associated with the particle in the external frame of reference will
be greater since then $h\nu = \gamma m_0 c^2$, where $\gamma = (1 - v^2/c^2)^{-1/2}$ is the Lorentz factor. On the
other hand, because of time dilation between the frame of reference of the moving particle
and the external observer, time intervals are increased and so the ‘internal frequency’ is
observed with a lower frequency $\nu_1 = m_0 c^2/\gamma h$. The relation between these frequencies is

$$\nu_1 = \nu/\gamma^2. \tag{9.14}$$


How are these frequencies to be interpreted? He identified $\nu$ with a ‘fictitious wave’
associated with the motion of the particle which propagated at a ‘superluminal’ velocity
$c^2/v$. Because the wave moved at a speed greater than the speed of light, it could not
transport energy. At time $t = 0$, the location of the particle and the wave coincided. At time
t, the particle had moved to the position $x = vt$ and the amplitude of its internal periodic
motion would be observed to be $\sin(2\pi\nu_1 x/v)$. At the same time, the amplitude of the
‘fictitious wave’ would be $\sin[2\pi\nu(t - xv/c^2)]$ since its propagation speed is $c^2/v$. But,
because of (9.14), it follows that the two wave motions are always in phase. Inverting the
argument, if the two waves are always to remain in phase, it follows that the ‘fictitious
wave’ must propagate at speed $c^2/v$.

Now de Broglie identifies the speed of the fictitious wave, $c^2/v$, with the phase velocity
$v_{\rm ph}$ of the wave. Then, the packet of energy associated with the wave will be propagated at
the group velocity $v_{\rm g}$ which in modern notation we write


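$$v_{\rm g} = \frac{\mathrm{d}\omega}{\mathrm{d}k}. \tag{9.15}$$

We can write $k = 2\pi/\lambda = 2\pi\nu/v_{\rm ph}$ and so

$$v_{\rm g} = \frac{\mathrm{d}\nu}{\mathrm{d}(\nu/v_{\rm ph})} \quad \text{with} \quad \nu = \frac{1}{h}\frac{m_0 c^2}{\sqrt{1 - \beta^2}} \quad \text{and} \quad v_{\rm ph} = \frac{c}{\beta}, \tag{9.16}$$

where we have written the speed of the electron as $\beta = v/c$. Evaluating the differential to
obtain $v_{\rm g}$, we find $v_{\rm g} = v$, that is, the group velocity of the fictitious waves is exactly equal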
to the velocity of the electron. As de Broglie writes in his paper (de Broglie, 1923a), 


‘The velocity of the moving body is the energy velocity of a group of waves having
frequencies $\nu = \frac{1}{h}\frac{m_0 c^2}{\sqrt{1 - \beta^2}}$ and velocities $\frac{c}{\beta}$ corresponding to very slightly different
values of $\beta$.’

This represents a key insight into how to reconcile the wave and particle properties of light 
quanta. But much more is to follow. 
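
The statement that the group velocity of the fictitious waves equals the particle velocity can be checked by differentiating numerically. The sketch below assumes the electron rest mass and CODATA constants, and uses a central finite difference in β with an illustrative step size; it evaluates dν/d(ν/v_ph) and compares it with v = βc.

```python
import math

# CODATA values in SI units (assumed; not quoted in the text).
H = 6.62607015e-34      # Planck constant, J s
M0 = 9.1093837015e-31   # electron rest mass, kg
C = 299792458.0         # speed of light, m/s


def frequency(beta):
    """nu = m0 c^2 / (h sqrt(1 - beta^2))."""
    return M0 * C**2 / (H * math.sqrt(1.0 - beta**2))


def phase_velocity(beta):
    """v_ph = c^2 / v = c / beta."""
    return C / beta


def inverse_wavelength(beta):
    """nu / v_ph = 1 / lambda."""
    return frequency(beta) / phase_velocity(beta)


def group_velocity(beta, d_beta=1e-6):
    """Central-difference estimate of d(nu) / d(nu / v_ph)."""
    d_nu = frequency(beta + d_beta) - frequency(beta - d_beta)
    d_k = inverse_wavelength(beta + d_beta) - inverse_wavelength(beta - d_beta)
    return d_nu / d_k


if __name__ == "__main__":
    for beta in (0.1, 0.5, 0.9):             # illustrative speeds
        print(f"beta = {beta}: v_g / v = {group_velocity(beta) / (beta * C):.9f}, "
              f"v_ph * v_g / c^2 = {phase_velocity(beta) * group_velocity(beta) / C**2:.9f}")
```

The product of phase and group velocities comes out as c², so that a superluminal phase velocity always accompanies a subluminal group velocity.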

Next, de Broglie notes a further suggestive result by comparing the path of a light ray
according to Fermat’s principle of least time in optics with that of a particle according to the
principle of least action. Fermat’s principle is that, in a medium of variable refractive index,
the path of a light ray is that which minimises the travel time between any two fixed points,
$\delta \int \mathrm{d}s/v = 0$, where ds is the element of path length. Since the frequency is a constant and
$v = \lambda\nu$, this can equally be written

$$\delta \int \frac{\mathrm{d}s}{\lambda} = 0, \tag{9.17}$$

where $\lambda$ is the wavelength. Now, $\lambda = v_{\rm ph}/\nu$ and since

$$\nu = \frac{1}{h}\frac{m_0 c^2}{\sqrt{1 - \beta^2}} \quad \text{and} \quad v_{\rm ph} = c^2/v = c/\beta, \tag{9.18}$$

we find

$$\delta \int \frac{\mathrm{d}s}{\lambda} = \delta \int \frac{m_0 \beta c}{h\sqrt{1 - \beta^2}}\,\mathrm{d}s = 0. \tag{9.19}$$


Now, de Broglie treats the motion of a particle according to the principle of least action,
which can be written in the form

$$\delta \int \sum_i \frac{\partial \mathcal{L}}{\partial \dot{q}_i}\,\mathrm{d}q_i = 0, \tag{9.20}$$


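where $\mathcal{L}$ is the Lagrangian of the particle. For a relativistic particle of mass $m_0$ moving in
a stationary electric potential $\phi(\mathbf{r})$, the Lagrangian takes the form

$$\mathcal{L} = -\frac{m_0 c^2}{\gamma} - q\phi(\mathbf{r}) = -m_0 c^2 \left(1 - \frac{v^2}{c^2}\right)^{1/2} - q\phi(\mathbf{r}) = -m_0 c^2 \left(1 - \frac{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}{c^2}\right)^{1/2} - q\phi(\mathbf{r}), \tag{9.21}$$

where $\gamma = (1 - v^2/c^2)^{-1/2}$ is the Lorentz factor. Taking $q_1 = x$, $q_2 = y$, $q_3 = z$, (9.20)
becomes

$$\delta \int \frac{m_0 \beta^2 c^2}{\sqrt{1 - \beta^2}}\,\mathrm{d}t = \delta \int \frac{m_0 \beta c}{\sqrt{1 - \beta^2}}\,\mathrm{d}s = 0. \tag{9.22}$$

This is exactly the same expression as (9.19). De Broglie immediately concluded:

‘The rays of the phase wave are identical with the paths which are dynamically possible.’

Now de Broglie makes the key assumption.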


‘It seems quite necessary that the phase wave shall find the electron in phase with itself. 
That is to say “The motion can only be stable if the phase wave is tuned with the length
of the path.” ’ 


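To achieve this, there must be an integral number of wavelengths around the orbit. Then,
(9.19) can be written

$$\oint \frac{\mathrm{d}s}{\lambda} = \int_0^{T_r} \frac{m_0 \beta^2 c^2}{h\sqrt{1 - \beta^2}}\,\mathrm{d}t = n, \tag{9.23}$$

where n is an integer and $T_r$ the period of revolution. We can now rewrite the last equality
in (9.23) in terms of the momentum of the electron,

$$\int_0^{T_r} \frac{m_0 \beta^2 c^2}{h\sqrt{1 - \beta^2}}\,\mathrm{d}t = \oint \frac{\gamma m_0 v}{h}\,\mathrm{d}s = n, \tag{9.24}$$

and so

$$\oint p\,\mathrm{d}s = \oint p_\phi\,\mathrm{d}\phi = nh, \tag{9.25}$$

exactly the Bohr-Sommerfeld quantisation condition.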
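
The content of the quantisation condition can be illustrated for the circular Bohr orbits of hydrogen. The sketch below assumes the standard Bohr-model results for the orbital radius and speed, which are not restated in this section, together with CODATA constants and the non-relativistic momentum p = m_e v; it counts the number of wavelengths λ = h/p that fit round each orbit.

```python
import math

# CODATA values in SI units (assumed; not quoted in the text).
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
H = 6.62607015e-34           # Planck constant, J s
M_E = 9.1093837015e-31       # electron mass, kg
C = 299792458.0              # speed of light, m/s

HBAR = H / (2 * math.pi)
ALPHA = E_CHARGE**2 / (4 * math.pi * EPS0 * HBAR * C)   # fine-structure constant


def bohr_radius(n):
    """Radius of the nth Bohr orbit, n^2 hbar / (m_e c alpha) (assumed result)."""
    return n**2 * HBAR / (M_E * C * ALPHA)


def bohr_speed(n):
    """Orbital speed in the nth Bohr orbit, alpha c / n (assumed result)."""
    return ALPHA * C / n


def wavelengths_around_orbit(n):
    """Circumference divided by lambda = h / p, with p = m_e v."""
    wavelength = H / (M_E * bohr_speed(n))
    return 2 * math.pi * bohr_radius(n) / wavelength


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, wavelengths_around_orbit(n))
```

In this approximation exactly n wavelengths fit round the nth orbit. Using the relativistic momentum γm₀v instead changes the count only by the factor γ, which differs from unity by a few parts in 10⁵ for the ground state.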

There is another way of looking at this result. Consider the circular motion of the electron
and its associated wave according to the Bohr model of the hydrogen atom. If $T_r$ is the period
of the orbit of the electron, the ‘fictitious wave’ and the electron would meet on their circular
orbit after a time $\tau$ given by equating the distance the fictitious wave travels round the orbit,
$(c^2/v)\tau$, to the distance moved by the electron plus the circumference of the orbit, $v\tau + vT_r$.
Therefore, writing $\mathrm{d}s = v\,\mathrm{d}t$, we find

$$\tau = \frac{v^2/c^2}{1 - v^2/c^2}\,T_r. \tag{9.26}$$

Therefore, the internal phase of the electron is changed by

$$2\pi\nu_1\tau = 2\pi\nu_0 (1 - \beta^2)^{1/2} \times \frac{\beta^2}{1 - \beta^2}\,T_r = \frac{2\pi m_0 c^2\,T_r\,\beta^2}{h\sqrt{1 - \beta^2}}. \tag{9.27}$$

This should be set equal to $2\pi n$ so that the electron and fictitious wave remain in phase.
For circular orbits, we therefore recover again the relation

$$\int_0^{T_r} \frac{m_0 \beta^2 c^2}{h\sqrt{1 - \beta^2}}\,\mathrm{d}t = \oint \frac{\gamma m_0 v}{h}\,\mathrm{d}s = n. \tag{9.28}$$

This is de Broglie’s famous formula for the quantisation of the ‘fictitious waves’ which
travel at the phase velocity $c^2/v$ round the electron’s orbit. The associated group velocity at
which the energy of the wave-packet travels round the orbit is just the velocity of the particle
v. As he remarked later in his paper, ‘I think that these ideas may be considered as a kind
of synthesis of optics and dynamics’. The second paper also contained the prediction that
the conclusions of his papers applied to electrons as well as to quanta of radiation and that


‘a stream of electrons passing through a sufficiently narrow hole should also exhibit 
diffraction phenomena.’ (de Broglie, 1923b) 


The closing paragraph of his great paper shows due caution and yet optimism that these considerations
represented a significant advance towards a proper theory of quantum mechanics.


‘Many of these ideas may be criticised and perhaps reformed, but it seems that now 
little doubt should remain of the existence of light quanta. Moreover, if our opinions are 
received, as they are grounded on the relativity of time, all the enormous experimental 
evidence of the “quantum” will turn in favour of Einstein’s conceptions.’ 


De Broglie went on to submit his doctoral thesis Researches on the theory of quanta in 
November 1923 to the Faculty of Science at the University of Paris (de Broglie, 1924a). 
The examining committee, consisting of Perrin, Cartan, Mauguin and Langevin, praised 
the striking originality of de Broglie’s researches but was sceptical of the physical reality 
of the waves associated with the electrons. When asked about experimental tests of the 
hypothesis, de Broglie proposed the diffraction of beams of electrons by crystals. It was not
realised that experimental evidence for diffraction effects had already been discovered, as 
we recount in the next section. 

A second important outcome was that Langevin discussed de Broglie’s researches at the 
Fourth Solvay Conference held in April 1924, in particular with Einstein. At that time, 
Einstein was working on the implications of Bose’s paper described in the last section and
realised the significance of de Broglie’s hypothesis for his theory of ideal gases, in particular,
for the energy fluctuation formula and the fluctuation component due to the wave aspects 
of the particles. Einstein received a copy of de Broglie’s dissertation in December 1924. He 
realised the significance of these ideas stating: 


‘I shall discuss the interpretation in greater detail because I believe that it involves more 
than merely an analogy.’ (Einstein, 1925) 


9.4 Electron diffraction 


Einstein discussed de Broglie’s thesis with Born who in turn brought it to the attention of 
James Franck, head of the department of experimental physics at Göttingen, and Born’s 
student Walter Elsasser. When Elsasser suggested that electron diffraction experiments 
might be attempted, Franck commented that this: 


‘would not be necessary since Davisson’s experiments had already established the expected
effect.’


What Franck was referring to were the experiments of Davisson and Kunsman which had 
shown considerable structure in the angular distribution of electrons scattered from nickel 
surfaces (Davisson and Kunsman, 1921). They had interpreted these results in terms of the 
scattering of electrons which penetrate one or more of the outer electron shells of the nickel 
atoms at the surface of the crystal. Elsasser and Franck interpreted the structure in the 
angular distribution of the scattered electrons differently as the peaks associated with the 
scattering of the electrons by the nickel crystal. The characteristics of the scattered electrons 
were similar to the well-known phenomena of the scattering of X-rays by crystal surfaces 
and agreed with the expectations of de Broglie’s theory if the waves associated with the
electrons have wavelength $\lambda = h/mv$. Elsasser also showed that interference effects could
explain the Ramsauer–Townsend effect, the fact that the cross-section for scattering
of slow electrons by rare gases, particularly that of argon, decreases to a very low value
below electron energies of 25 eV (Ramsauer, 1921; Townsend and Bailey, 1922). Elsasser’s
note on the interpretation of these phenomena was published in Die Naturwissenschaften 
in August 1925 (Elsasser, 1925). 

Davisson was not, however, convinced but continued his experiments on the scattering 
of electrons by nickel crystals. The remarkable series of events which led to his famous discovery
is recounted in the opening paragraphs of his paper with Germer of 1927 (Davisson
and Germer, 1927). 


‘The investigation reported in this paper was begun as the result of an accident which 
occurred in this laboratory in April 1925. At that time we were continuing an investigation, 
first reported in 1921 (Davisson and Kunsman, 1921), of the distribution-in-angle of 
electrons scattered by a target of ordinary (poly-crystalline) nickel. During the course of 
this work a liquid-air bottle exploded at a time when the target was at a high temperature; 
the experimental tube was broken, and the target heavily oxidized by the inrushing air.
The oxide was eventually reduced and a layer of the target removed by vaporization, but only
after prolonged heating at various temperatures in hydrogen and in vacuum.

When the experiments were continued it was found that the distribution-in-angle of the
scattered electrons had been completely changed.’


Fig. 9.2. The results of the Davisson and Germer electron scattering experiments. The curves labelled A and B show the
diffraction maximum before (A) and after (B) the heating and recrystallisation of the nickel crystals for the ‘54 volt’
electron beam (Davisson and Germer, 1927).


What had happened was that the original nickel sample consisted of a number of separate 
crystals. The effect of the processing subsequent to the accident was that the separate crystals 
were fused into a single nickel crystal with much improved diffraction capabilities. The 
improvement is illustrated in Fig. 9.2 which shows the famous diffraction maximum at
an angle of 50° using the original crystals and the larger single crystal. This maximum
disappeared if the accelerating voltage was appreciably changed. The separation $a$ between
the rows of atoms at the surface of the nickel crystal, the grating constant, was 2.15 Å and so the
grating formula analogous to Bragg’s formula for the diffraction of X-rays, $n\lambda = a\sin\theta$, can be used to estimate the wavelength
$\lambda$ of the incident beam, where $n$ is an integer and $\theta$ is the angle between the incident
and diffracted beams. For $n = 1$ and $\theta = 50°$, this formula resulted in a wavelength of $\lambda = 1.65$ Å, in excellent
agreement with the expectation of de Broglie’s formula $\lambda = h/mv = (150/V)^{1/2}$ Å $= 1.67$ Å for an
accelerating voltage $V = 54$ V.
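
The arithmetic behind this comparison can be checked with a few lines of Python. This is only a sketch under stated assumptions: the electrons are treated non-relativistically, the measured wavelength is taken from the plane-grating condition $n\lambda = a\sin\theta$ with $\theta$ measured between the incident and diffracted beams, and the function names and rounded constants are chosen here for illustration.

```python
import math

# Physical constants in SI units (rounded values, adequate for this check).
H = 6.62607e-34       # Planck constant, J s
M_E = 9.10938e-31     # electron mass, kg
Q_E = 1.602177e-19    # elementary charge, C
ANGSTROM = 1e-10      # metres per angstrom


def de_broglie_wavelength_angstrom(volts: float) -> float:
    """Non-relativistic wavelength h/(m v) of an electron accelerated through `volts`."""
    momentum = math.sqrt(2.0 * M_E * Q_E * volts)  # from (1/2) m v^2 = e V
    return H / momentum / ANGSTROM


def grating_wavelength_angstrom(spacing_angstrom: float, angle_deg: float, order: int = 1) -> float:
    """Wavelength from the grating condition n * lambda = a * sin(theta)."""
    return spacing_angstrom * math.sin(math.radians(angle_deg)) / order


lam_theory = de_broglie_wavelength_angstrom(54.0)
lam_rule = math.sqrt(150.0 / 54.0)               # the (150/V)^(1/2) rule of thumb
lam_measured = grating_wavelength_angstrom(2.15, 50.0)

print(f"de Broglie, h/mv        : {lam_theory:.3f} angstrom")
print(f"de Broglie, (150/V)^1/2 : {lam_rule:.3f} angstrom")
print(f"grating, a sin(theta)   : {lam_measured:.3f} angstrom")
```

The two estimates differ by roughly one per cent, which is the sense in which the measured maximum agrees with de Broglie’s formula.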

In fact these results were not published until 1927, although they were well-known 
through discussions with Born, Hartree and others at the 1926 meeting of the British 
Association for the Advancement of Science held in Oxford. The clinching experiments 
were carried out by George P. Thomson, the son of J. J. Thomson, and Andrew Reid at 
the University of Aberdeen. They passed a collimated beam of electrons through a thin 
film of celluloid and the resulting diffraction pattern was similar to that observed in the 
diffraction of X-rays of the same energy as the electrons (Thomson and Reid, 1927). In
a subsequent paper, Thomson (1928) demonstrated diffraction rings associated with the
passage of electrons through thin films of gold, celluloid and other substances (Fig. 9.3).
There followed numerous experiments demonstrating the wave nature
of beams of electrons. The 1937 Nobel Prize for physics was awarded to Davisson and
Thomson. As famously remarked by Jammer (1989),


‘... Thomson, the father, was awarded the Nobel prize for having shown that the electron
is a particle, and Thomson, the son, for having shown that the electron is a wave.’


Fig. 9.3. Examples of photographs of the diffraction rings observed in electron diffraction experiments in which a beam of
electrons is incident upon thin films of different materials: gold, celluloid and ‘film X’ (Thomson, 1928).


These electron diffraction experiments provided incontrovertible evidence for the correctness
of de Broglie’s deep insight into the wave-particle duality for light and matter. We
have, however, now run far ahead of the chronological unfolding of the story. Let us review 
the status of the old quantum theory as it stood at the end of 1924. 


9.5 What had been achieved by the end of 1924 


The discussion of the last six chapters demonstrates the remarkable efforts undertaken by 
many outstanding physicists and mathematicians to come to terms with the challenges posed 
by the introduction of quantisation and quanta into the infrastructure of classical physics. It 
will be recognised that most of the distinctive features which a successful quantum theory 
had to encompass were already in place, but there was no coherent quantum theory which 
could accommodate all of them, let alone the deeper implications of these discoveries
for the understanding of the physical world. It is helpful to list what had been
achieved by the end of 1924. 


• The Bohr model of the atom and the concept of stationary states which electrons can
occupy in orbits about the atomic nucleus (Chap. 4).

• The introduction of additional quantum numbers to account for multiplets in atomic
spectra and the associated selection rules for permitted transitions (Chap. 5 and Sects. 6.4
and 6.5).

• Bohr’s correspondence principle which enabled classical physical concepts to inform the
rules of the old quantum theory (Sect. 6.3).

• The explanation of the Stark and Zeeman effects in terms of the effects of quantisation
upon atoms in electric and magnetic fields (Chap. 7).

• Landé’s vector model of the atom which could account for the splitting of spectral lines
in terms of the addition of angular momenta according to the prescription given by the
expression for the Landé g-factor (Sect. 7.4).

• The experimental demonstration of space quantisation by the Stern–Gerlach experiment
(Sect. 7.5).

• The concept of electron shells in atoms and the understanding of the electronic structure
of the periodic table (Chap. 8).

• The discovery of Pauli’s exclusion principle and the requirement of four quantum numbers
to describe the properties of atoms (Sect. 8.3).

• The discovery of electron spin as a distinctive quantum property of these particles;
although inferred from angular momentum concepts, the intrinsic spin of the electron is
not ‘angular momentum’ in the sense of rotational motion (Sect. 8.4).

• The confirmation of the existence of light quanta as a result of Compton’s X-ray scattering
experiments (Sect. 9.1).

• The discovery of Bose–Einstein statistics which differed very significantly from classical
Boltzmann statistics (Sect. 9.2).

• The introduction of de Broglie waves associated with the momentum of the electron
and the extension of the wave-particle duality to particles as well as to light quanta
(Sect. 9.3).


The reader will immediately recognise that the above list is the product of hindsight, 
highlighting features which would have to be incorporated into a new theory of what was 
to become quantum mechanics. The old quantum theory could not accommodate these into 
a single coherent theory. In essence, the old quantum theory can be considered classical 
physics with the addition of the Bohr-Sommerfeld quantisation rules, namely, 


$$\oint p_\phi\,d\phi = nh, \tag{9.29}$$


as well as a set of selection rules to ensure that the observed optical and X-ray spectral 
features could be reproduced. 
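
To illustrate how much the rule (9.29) delivers in the simplest case, the following Python sketch evaluates the energy levels of hydrogen for circular orbits only. It is an illustration under stated assumptions rather than a reconstruction of any historical calculation: for a circular orbit $p_\phi$ is constant, so that (9.29) reduces to $p_\phi = nh/2\pi$, and this is combined with the classical balance between the Coulomb attraction and the centripetal acceleration. The function name and the rounded constants are chosen here for illustration.

```python
EPS0 = 8.8541878e-12   # vacuum permittivity, F/m
M_E = 9.1093837e-31    # electron mass, kg
Q_E = 1.60217663e-19   # elementary charge, C
H = 6.62607015e-34     # Planck constant, J s


def bohr_energy_ev(n: int) -> float:
    """Energy in eV of the n-th circular orbit of hydrogen.

    Uses p_phi = m v r = n h / (2 pi) together with the force balance
    m v^2 / r = e^2 / (4 pi eps0 r^2), which gives
    E_n = - m e^4 / (8 eps0^2 h^2 n^2).
    """
    energy_joule = -M_E * Q_E**4 / (8.0 * EPS0**2 * H**2 * n**2)
    return energy_joule / Q_E


for n in (1, 2, 3):
    print(f"n = {n}:  E = {bohr_energy_ev(n):8.3f} eV")
```

The familiar $1/n^2$ ladder of hydrogen follows directly; the difficulties described next arise as soon as a second electron is added.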

Besides the problems of coherence, there was the highly non-trivial problem that, although
the properties of the hydrogen atom could be explained rather well by the Bohr–Sommerfeld
model, the next element in the periodic table, helium with two electrons rather
than one and twice the positive nuclear charge, defied explanation despite the strenuous 
efforts of many theorists. For heavier elements, the problems were no simpler. 

But even more fundamentally, the problem of the stability of atoms, particularly the decay of the electron
orbits because of the emission of electromagnetic radiation, had no satisfactory solution.
The old quantum theory got round this problem by simply requiring that accelerated
electrons at the atomic level did not emit electromagnetic radiation. The best that could
be done was to assert that Bohr’s stationary states genuinely were ‘stationary’ and at the
atomic level the continuous loss of energy did not occur. Energy loss only occurred in
transitions between stationary states. Einstein had proposed how this could be formalised
in his remarkable introduction of spontaneous and induced transition probabilities between
stationary states and, as we will see, this was to be the touchstone for the development of
approaches which would ultimately lead to a genuine theory of quantum
mechanics. Something had gone profoundly wrong with classical mechanics and dynamics
at the atomic level, and the solution of that problem was to prove to be truly revolutionary.


Downloaded from Cambridge Books Online by IP 132.166.47.205 on Thu Jan 23 07:53:49 GMT 2014. 
http://dx.doi.org/10.1017/CBO9781139062060.010 
Cambridge Books Online © Cambridge University Press, 2014 













THE DISCOVERY OF 
QUANTUM MECHANICS 


Downloaded from Cambridge Books Online by IP 132.166.47.205 on Thu Jan 23 07:54:13 GMT 2014. 
http://ebooks.cambridge.org/ebook.jsf?bid=CBO9781 139062060 
Cambridge Books Online © Cambridge University Press, 2014 













10 The collapse of the old quantum theory and the seeds of its regeneration


‘In spite of its high-sounding name and its successful solutions of numerous problems 
in atomic physics, quantum theory, and especially the quantum theory of polyelectron 
systems, prior to 1925, was, from the methodological point of view, a lamentable hodgepodge
of hypotheses, principles, theorems and computational recipes rather than a logical
consistent theory. Every single quantum-theoretic problem had to be solved first in terms 
of classical physics; its classical solution had then to pass through the mysterious sieve of 
the quantum conditions or, as it happened in the majority of cases, the classical solution 
had to be translated into the language of quanta in conformance with the correspondence 
principle. Usually, the process of finding the ‘correct solution’ was a matter of skillful 
guessing and intuition, rather than of deductive or systematic reasoning.’ (Jammer, 1989) 


Although written with the benefit of hindsight, there is no doubt that, by the end of 1924, 
there was a major crisis in the attempts to create a system of ‘quantum mechanics’ which
could encompass all the features of atoms and their spectra. At the heart of the problem 
was the wave-particle duality first enunciated by Einstein in 1905 and reinforced by de 
Broglie’s remarkable association of ‘matter-waves’ with electrons in 1924. As recorded by 
Jammer (1989), 


‘This state of affairs was well characterised by Sir William Bragg when he said that 
physicists are using on Mondays, Wednesdays, and Fridays the classical theory and on 
Tuesdays, Thursdays and Saturdays the quantum theory of radiation.’ 


The transition from the old quantum theory to the completed theory of quantum mechanics 
took place remarkably rapidly, over a period of only a few years, but it was not a simple 
story. Born remarked that it was not even ‘a straight staircase upward’, but rather a ‘tangle 
of interconnected alleys.’ 

The story begins with one of the thorniest problems, the understanding of the physics of 
the dispersion of light by material media. Despite the apparently intractable nature of the 
problem, the attempts to unravel it were to open up new approaches which would soon lead
to a radically new description of physics at the atomic level.


10.1 Ladenburg, Kramers and the theory of dispersion 




The dispersion problem is at the heart of the interaction between matter and radiation. 
Classically, the dispersion of electromagnetic waves results from the dependence upon 
frequency of the refractive index of the medium through which the waves propagate. The
simplest example occurs in continuous media in which there is a linear dependence of the
polarisation $P$ of the medium upon the applied electric field strength $E$, $P = \chi\varepsilon_0 E$, where
$\chi$ is the electric susceptibility of the medium. It is straightforward to solve the first two
of Maxwell’s equations (1.5) and (1.6) for an incident electromagnetic wave. In a linear
medium,

$$D = \varepsilon_0 E + P = (1+\chi)\varepsilon_0 E = \varepsilon\varepsilon_0 E, \tag{10.1}$$

where $\varepsilon$ is the relative permittivity of the material. Going through the standard procedure
of reducing (1.5) and (1.6) to a wave equation in $E$ or $H$, the speed of propagation
of the waves is $(\varepsilon\varepsilon_0\mu_0)^{-1/2} = c/\sqrt{\varepsilon}$. The refractive index of the material $n = \sqrt{\varepsilon}$ is a
readily measurable quantity. The term dispersion is used to describe the phenomena which 
occur when the refractive index is a function of frequency, resulting in the ‘dispersion’ or 
‘smearing out’ of the frequency components of a wave-packet. 

In the presence of an absorption line, there is a strong dependence of $\chi$ upon frequency
and it had been established empirically that the variation of $\chi$ with frequency could be
written in the form

$$\chi = \frac{e^2}{\varepsilon_0 m_e}\sum_i \frac{f_i}{\omega_i^2 - \omega^2}, \tag{10.2}$$

where $\omega_i$ is the central frequency of the absorption line and $f_i$ is a constant, the significance
of which will become apparent shortly.

A relation of this form can be derived from classical electromagnetic theory and, in
fact, we have already almost derived it in Sect. 2.3.3. The classical theory of dispersion
was expounded by Paul Drude (1900). Formally, the polarisation $P$ is the dipole moment
per unit volume of the material, in this case, induced by the electric field of the incident
electromagnetic wave. For convenience, we repeat the formula (2.9) which describes the
motion of an oscillator of natural angular frequency $\omega_0$ under the influence of an incident
electromagnetic wave:

$$\ddot{x} + \gamma\dot{x} + \omega_0^2 x = \frac{F}{m}. \tag{10.3}$$

For simplicity, we take the reduced mass of the oscillator to be the electron mass $m_e$. If
the oscillator is accelerated by the $E$ field of an incident wave, $F = eE_0\exp(i\omega t)$. To find
the response of the oscillator, a trial solution for $x$ of the form $x = x_0\exp(i\omega t)$ is adopted.
Then

$$x_0 = \frac{eE_0}{m_e(\omega_0^2 - \omega^2 + i\gamma\omega)}. \tag{10.4}$$


The complex factor in the denominator means that the oscillator does not vibrate in phase
with the incident wave. The dipole moment of the oscillator is $p = ex$ with respect to its
rest position and so, for a single oscillator, multiplying through by $e$, we find

$$p = \frac{e^2E_0}{m_e(\omega_0^2 - \omega^2 + i\gamma\omega)}. \tag{10.5}$$

Multiplying by the complex conjugate of (10.5) and taking the square root, we find the
amplitude of the induced dipole moment,

$$|p| = \frac{e^2E_0}{m_e\left[(\omega_0^2-\omega^2)^2 + \gamma^2\omega^2\right]^{1/2}}. \tag{10.6}$$

Thus, provided the frequency is not too close to the resonance frequency $\omega_0$, the second
term in square brackets in the denominator can be neglected and the expression for the
electric susceptibility $\chi_0$ of a single oscillator is

$$\chi_0 = \frac{e^2}{\varepsilon_0 m_e(\omega_0^2 - \omega^2)}, \tag{10.7}$$

exactly the same as the relation (10.2). Summing over all the oscillators with angular
frequencies $\omega_i$ in unit volume of the material, Drude’s formula for the electric susceptibility
is obtained,

$$\chi = \frac{e^2}{\varepsilon_0 m_e}\sum_i \frac{f_i}{\omega_i^2 - \omega^2}, \tag{10.8}$$

where the factor $f_i$ was interpreted by Drude as the ‘number of dispersion electrons’ per
atom. The same formula was derived by Pauli in his important review of 1926 where $f_i$ was
referred to as the ‘strength’ of the oscillator (Pauli, 1926). This variation of the dispersion
in the vicinity of an absorption line was known as anomalous dispersion. If the anomalous
dispersion was measured away from the line centre, the value of $f_i$ could
be determined for the oscillator.
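
The statement that the damping term matters only close to resonance is easily explored numerically. The following Python sketch evaluates the complex amplitude of the form (10.4) and compares the modulus of the induced dipole moment, as in (10.6), with the undamped approximation. The values $\omega_0 = 1$ and $\gamma = 0.01$, the unit charge, mass and field, and the function names are all illustrative assumptions, not values taken from the text.

```python
import cmath
import math


def amplitude(omega, omega0, gamma, e_field=1.0, charge=1.0, mass=1.0):
    """Complex displacement amplitude x0 of the driven, damped oscillator, as in (10.4)."""
    return charge * e_field / (mass * (omega0**2 - omega**2 + 1j * gamma * omega))


def dipole_exact(omega, omega0, gamma, e_field=1.0, charge=1.0, mass=1.0):
    """Modulus of the induced dipole moment, as in (10.6)."""
    return abs(charge * amplitude(omega, omega0, gamma, e_field, charge, mass))


def dipole_off_resonance(omega, omega0, e_field=1.0, charge=1.0, mass=1.0):
    """Dipole moment with the damping term neglected; diverges at omega = omega0."""
    return charge**2 * e_field / (mass * (omega0**2 - omega**2))


OMEGA0, GAMMA = 1.0, 0.01  # illustrative dimensionless values

for omega in (0.5, 0.9, 0.99):
    exact = dipole_exact(omega, OMEGA0, GAMMA)
    approx = dipole_off_resonance(omega, OMEGA0)
    # The oscillator lags the driving field by minus the phase of x0.
    lag_deg = -math.degrees(cmath.phase(amplitude(omega, OMEGA0, GAMMA)))
    print(f"omega = {omega:4.2f}  exact = {exact:8.3f}  "
          f"approx = {approx:8.3f}  phase lag = {lag_deg:5.2f} deg")
```

With these values the two expressions agree closely at half the resonant frequency and part company as $\omega$ approaches $\omega_0$, where the phase lag grows towards a quarter of a cycle.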

The seeds of a new approach to the wave-particle duality were contained in a paper by
Ladenburg (1921) who combined Drude’s theory with Einstein’s quantum theory of the
emission and absorption of radiation (Einstein, 1916) discussed in Sect. 6.2. Consider a
system of $N$ oscillators each of mass $m_e$, charge $e$ and oscillation frequency $\nu_0 = \omega_0/2\pi$
in thermal equilibrium in an enclosure. The classical average energy loss rate of a single
oscillator is given by (2.4),

$$-\frac{d\mathcal{E}}{dt} = \gamma\mathcal{E}, \tag{10.9}$$

where $\gamma = \omega_0^2e^2/6\pi\varepsilon_0c^3m_e$. The radiation loss rate of the system of oscillators $J_{\rm cl}$ was
written in the following form by Ladenburg,

$$J_{\rm cl} = \frac{N\bar{U}}{\tau}, \tag{10.10}$$

where $\tau = 1/\gamma$ and $\bar{U}$ is the mean energy of each oscillator. As shown by Planck, the mean
energy of the oscillator is directly related to the mean energy density of radiation within
the enclosure. In Sect. 2.3.3, this relation was derived for the case in which the motion of
the electron could be modelled by three orthogonal oscillators and so for a single oscillator,
(2.20) can be written,

$$u(\nu_0) = \frac{8\pi\nu_0^2}{3c^3}\,\bar{U} = \frac{2\omega_0^2}{3\pi c^3}\,\bar{U}. \tag{10.11}$$

Notice that $u(\nu_0)$ is the energy density of radiation per unit frequency interval. Therefore,
substituting for $\bar{U}$ in (10.10), the classical rate of loss of energy of the oscillators in the
enclosure can be written

$$J_{\rm cl} = \frac{e^2}{4\varepsilon_0 m_e}\,N u(\nu_0). \tag{10.12}$$

This expression relates the rate of loss of energy directly to the spectral energy density
$u(\nu_0)$ of the radiation field at frequency $\nu_0$.

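Ladenburg now tackled the same problem from the quantum theoretical point of view
using the Einstein A and B coefficients discussed in Sect. 6.2:

$$A_n^m = \frac{8\pi h\nu_0^3}{c^3}\,B_n^m \quad\text{and}\quad B_m^n = B_n^m . \tag{10.13}$$

$m$ labels the lower energy state $\mathcal{E}_m$ and $n$ the upper state $\mathcal{E}_n$ and so $h\nu_0 = \mathcal{E}_n - \mathcal{E}_m$; the subscript
of each coefficient labels the initial state of the transition and the superscript the final state. The rate
of absorption of energy by the set of oscillators is therefore

$$J_{\rm qu} = h\nu_0 N_m B_m^n u(\nu_0). \tag{10.14}$$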


Ladenburg next used Bohr’s correspondence principle to equate the classical energy loss
rate to the quantum theoretical rate of absorption of energy, $J_{\rm cl} = J_{\rm qu}$, and so from (10.13),

$$N = \frac{\varepsilon_0 m_e c^3}{2\pi\nu_0^2 e^2}\,N_m A_n^m . \tag{10.15}$$

According to Bohr, the correspondence principle should only be applied to transitions
between states with large principal quantum numbers, but Ladenburg assumed that (10.15)
should apply for all quantum transitions. Next, he identified $N/N_m$ with $f_i$, the number of
oscillators per atom derived in (10.8). We recall that classically Drude identified $f_i$ as the
number of dispersion electrons per atom. In Ladenburg’s exposition, $N_m$ is taken to be the
number of atoms in the ground state. Therefore,

$$f_i = \frac{\varepsilon_0 m_e c^3}{2\pi\nu_0^2 e^2}\,A_n^m . \tag{10.16}$$

Thus, Ladenburg had derived an expression for Einstein’s spontaneous emission coefficient
$A_n^m$ in terms of measurable quantities since $f_i$ can be found from the frequency dependence
of the polarisation properties of the medium. Inserting (10.16) into (10.8), with $\nu = \omega/2\pi$, we find the
relation between the polarisation and the spontaneous transition probabilities,

$$P = \chi\varepsilon_0 E = \frac{\varepsilon_0 c^3 E}{8\pi^3}\sum_n \frac{A_n^m}{\nu_0^2(\nu_0^2 - \nu^2)} . \tag{10.17}$$




This expression can be thought of as describing the amplitude of the varying electric dipole
moment of the medium which is to account for dispersion phenomena. But, implicitly,
Ladenburg had made a crucial conceptual advance, which was not explicitly stated in his
paper. As noted by van der Waerden (1967),


‘Ladenburg replaced the atom, so far as its interaction with the radiation field is concerned,
by a set of harmonic oscillators with frequencies equal to the absorption frequencies $\nu_0$
of the atom.’

This was explicitly recognised by Bohr, Kramers and Slater (1924) who referred to the
model as consisting of a set of virtual oscillators.
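
Ladenburg’s matching of the classical and quantum rates can be checked numerically. The Python sketch below assumes the forms $J_{\rm cl} = (e^2/4\varepsilon_0 m_e)Nu$, $J_{\rm qu} = h\nu_0 N_m B u$ and $A = (8\pi h\nu_0^3/c^3)B$, and evaluates the resulting number of oscillators per atom, $f = \varepsilon_0 m_e c^3 A/2\pi\nu_0^2 e^2$. The input numbers, a transition probability of $6.265\times10^{8}$ s$^{-1}$ at a wavelength of 121.567 nm appropriate to the Lyman-$\alpha$ line of hydrogen, are modern values inserted purely as an illustration; the function names are likewise chosen here.

```python
import math

EPS0 = 8.8541878e-12   # vacuum permittivity, F/m
M_E = 9.1093837e-31    # electron mass, kg
Q_E = 1.60217663e-19   # elementary charge, C
C = 2.99792458e8       # speed of light, m/s
H = 6.62607015e-34     # Planck constant, J s


def ladenburg_f(a_coeff: float, nu0: float) -> float:
    """Oscillators per atom, f = eps0 m_e c^3 A / (2 pi nu0^2 e^2)."""
    return EPS0 * M_E * C**3 * a_coeff / (2.0 * math.pi * nu0**2 * Q_E**2)


def classical_rate(n_osc: float, u: float) -> float:
    """Classical rate J_cl = e^2 N u / (4 eps0 m_e)."""
    return Q_E**2 * n_osc * u / (4.0 * EPS0 * M_E)


def quantum_rate(n_lower: float, a_coeff: float, nu0: float, u: float) -> float:
    """Quantum absorption rate J_qu = h nu0 N_m B u, with B = A c^3 / (8 pi h nu0^3)."""
    b_coeff = a_coeff * C**3 / (8.0 * math.pi * H * nu0**3)
    return H * nu0 * n_lower * b_coeff * u


# Illustrative inputs (assumptions): hydrogen Lyman-alpha.
A_LYA = 6.265e8             # spontaneous transition probability, 1/s
NU_LYA = C / 121.567e-9     # line frequency, Hz

f_value = ladenburg_f(A_LYA, NU_LYA)
print(f"f = {f_value:.3f}")

# With N = f * N_m oscillators the two rates coincide for any u.
print(classical_rate(f_value * 1.0, 1.0) / quantum_rate(1.0, A_LYA, NU_LYA, 1.0))
```

For these inputs $f$ comes out at roughly 0.14. The tabulated absorption oscillator strength of this line is about 0.42, three times larger; the difference is the ratio of the statistical weights of the upper and lower levels, which the simple form used here does not include.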

The result (10.17) was generalised by Kramers (1924) for the case in which the lower
state of the transition was not the ground state. In this case, he argued that (10.17) should
include the contribution of virtual oscillators with energies $\mathcal{E} < \mathcal{E}_m$. To convert (10.17) into
a more convenient form, let us label the energy level of the state we are interested in as $i$,
higher energy states by $k$ and lower energy states by $k'$. Then, $\nu_0 = \nu_{ki} = (\mathcal{E}_k - \mathcal{E}_i)/h$ and
$\nu_{ik'} = (\mathcal{E}_i - \mathcal{E}_{k'})/h$. Thus, (10.17) becomes

$$P = \chi\varepsilon_0 E = \frac{\varepsilon_0 c^3 E}{8\pi^3}\sum_k \frac{A_k^i}{\nu_{ki}^2(\nu_{ki}^2 - \nu^2)} . \tag{10.18}$$

Kramers showed that a similar additional term to that in (10.17) should be included when
the transitions are not to the ground state so that the full expression of the polarisation
would read

$$P = \chi\varepsilon_0 E = \frac{\varepsilon_0 c^3 E}{8\pi^3}\left[\sum_{\mathcal{E}_k > \mathcal{E}_i} \frac{A_k^i}{\nu_{ki}^2(\nu_{ki}^2 - \nu^2)} - \sum_{\mathcal{E}_{k'} < \mathcal{E}_i} \frac{A_i^{k'}}{\nu_{ik'}^2(\nu_{ik'}^2 - \nu^2)}\right] . \tag{10.19}$$

Kramers realised that this second term had a somewhat curious significance, but that it had 
to be there. As he wrote in his paper to Nature, 


‘The reaction of the atom against the incident radiation can thus be formally compared 
with the action of a set of virtual harmonic oscillators inside the atom, conjugated with 
the different possible transitions to other stationary states....one might introduce the 
following terminology: in the final state of the transition the atom acts as a “positive 
virtual oscillator” of relative strength +f; in the initial state, it acts as a negative virtual
oscillator of strength −f. However unfamiliar this “negative dispersion” might appear
from the point of view of the classical theory, it may be noted that it exhibits a close 
analogy with the “negative absorption” which was introduced by Einstein, in order to 
account for the law of temperature radiation on the basis of the quantum theory.’ 


The connection with the correspondence principle was driven home in the last sentence of
his paper:


‘It may be remembered, however, that the presence of the second term in [10.19] is
necessary if the classical theory can be applied in the limiting region where the motions
in successive stationary states differ by only small amounts from each other.’
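
The role of the ‘negative virtual oscillators’ can be made concrete with a small numerical sketch of a dispersion formula of the Kramers type, in which the sum over transitions to lower states is subtracted from the sum over transitions to higher states, with the common prefactor $\varepsilon_0 c^3 E/8\pi^3$. The three-level atom used below, with its transition probabilities and frequencies, is entirely hypothetical and is chosen only to show the sign of the effect; the function name is also an assumption of this sketch.

```python
import math

EPS0 = 8.8541878e-12   # vacuum permittivity, F/m
C = 2.99792458e8       # speed of light, m/s


def kramers_polarisation(nu, e_field, upward, downward=()):
    """Polarisation from a two-sum dispersion formula of the Kramers type.

    upward:   (A, nu_t) pairs for transitions to higher states (positive oscillators)
    downward: (A, nu_t) pairs for transitions to lower states (negative oscillators)
    """
    prefactor = EPS0 * C**3 * e_field / (8.0 * math.pi**3)
    positive = sum(a / (nu_t**2 * (nu_t**2 - nu**2)) for a, nu_t in upward)
    negative = sum(a / (nu_t**2 * (nu_t**2 - nu**2)) for a, nu_t in downward)
    return prefactor * (positive - negative)


# Hypothetical atom: one line above and one line below the occupied state.
UPWARD = [(1.0e8, 6.0e14)]     # (A in 1/s, frequency in Hz)
DOWNWARD = [(5.0e7, 4.0e14)]
NU = 1.0e14                    # incident frequency, below both lines

p_ground_like = kramers_polarisation(NU, 1.0, UPWARD)
p_excited = kramers_polarisation(NU, 1.0, UPWARD, DOWNWARD)
print(p_ground_like, p_excited)
```

When the list of downward transitions is empty the expression reduces to the single sum of Ladenburg’s formula; including a downward transition lowers the polarisation at frequencies below the lines, which is the ‘negative dispersion’ to which Kramers refers.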


Kramers did not give the proof of (10.19) in his paper of 1924, but it was derived in detail 
in his paper with Heisenberg in the following year (Kramers and Heisenberg, 1925). 
Without going through the details of that proof, we can understand why the second term 
on the right-hand side of (10.19) is necessary. It will be recalled that, in Einstein’s derivation
of Planck’s formula using his A and B coefficients, it is essential to incorporate the
induced emission term in order to obtain the correct form of the black-body spectrum in
the long wavelength, or Rayleigh–Jeans, limit. If the term is omitted, the Wien, rather than
the Planck, distribution is obtained. Now, according to the correspondence principle, the 
classical and quantum formalisms should coincide in the limit of large quantum numbers
and this corresponds exactly to the Rayleigh–Jeans region of the Planck spectrum. This
explains why the ‘negative virtual oscillators’ need to be included in (10.19). This important 
feature of the formalism was clarified in a paper by John van Vleck (1924). Again, the 
correspondence principle provides guidance about what the correct form of the quantum 
relations has to be. 
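
The part played by induced emission in this argument can be illustrated numerically. The Python sketch below assumes the standard balance between absorption and emission for two levels with Boltzmann populations: with induced emission the balance gives $u = (A/B)/[\exp(h\nu/kT) - 1]$, while without it the result is $u = (A/B)\exp(-h\nu/kT)$, the Wien form. Both are compared with the Rayleigh–Jeans expression $8\pi\nu^2kT/c^3$ at a frequency for which $h\nu/kT = 0.01$. The temperature and the function names are illustrative choices.

```python
import math

H = 6.62607015e-34     # Planck constant, J s
K_B = 1.380649e-23     # Boltzmann constant, J/K
C = 2.99792458e8       # speed of light, m/s


def a_over_b(nu: float) -> float:
    """Ratio of the spontaneous to the induced coefficient, 8 pi h nu^3 / c^3."""
    return 8.0 * math.pi * H * nu**3 / C**3


def u_with_induced(nu: float, temp: float) -> float:
    """Energy density when induced emission is included (Planck form)."""
    return a_over_b(nu) / math.expm1(H * nu / (K_B * temp))


def u_without_induced(nu: float, temp: float) -> float:
    """Energy density when induced emission is omitted (Wien form)."""
    return a_over_b(nu) * math.exp(-H * nu / (K_B * temp))


def u_rayleigh_jeans(nu: float, temp: float) -> float:
    """Classical Rayleigh-Jeans energy density."""
    return 8.0 * math.pi * nu**2 * K_B * temp / C**3


TEMP = 300.0                       # illustrative temperature, K
NU_LOW = 0.01 * K_B * TEMP / H     # frequency with h nu / k T = 0.01

ratio_planck = u_with_induced(NU_LOW, TEMP) / u_rayleigh_jeans(NU_LOW, TEMP)
ratio_wien = u_without_induced(NU_LOW, TEMP) / u_rayleigh_jeans(NU_LOW, TEMP)
print(f"with induced emission   : {ratio_planck:.4f} of Rayleigh-Jeans")
print(f"without induced emission: {ratio_wien:.4f} of Rayleigh-Jeans")
```

In this low-frequency regime the expression with induced emission lies within about half a per cent of the classical result, whereas the expression without it falls short by about two orders of magnitude, which is why the corresponding negative term cannot be dropped from (10.19) if the classical limit is to be recovered.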


10.2 Slater and the Bohr-Kramers-Slater theory 


The next contribution to the ‘tangle of interconnected alleys’ was made by John C. Slater 
who completed his doctorate at Harvard University in 1923 and arrived in Copenhagen as 
a Sheldon Fellow of Harvard University in 1924. The source of Slater’s insight came from 
the difficulty of reconciling the classical picture of electromagnetic waves with Einstein’s 
concept of light quanta. In particular, how could exact conservation of energy and momentum
be maintained when a continuous electromagnetic wave interacted with the discrete
energy levels within atoms? As expressed by Jammer (1989), 


‘... it was hard to understand how, for example, in a system composed of an electromagnetic
radiation field, susceptible of only continuous changes of energy, and an aggregate of
atoms, emitting and absorbing only discrete quanta of energy, the sum total of a continuous 
and of a discrete amount of energy could be a constant. ... But there was an alternative: 
one could reject the energy principle as an exact law and regard it merely as a statistical 
law.’ 


This concept had been advocated by Darwin (1922) who remarked that, 


‘. . . as pure dynamics has failed to explain many atomic phenomena, there seems no reason 
to maintain the exact conservation of energy, which is only one of the consequences of 
the dynamical equations.’ 


Darwin’s paper of 1922 on the quantum theory of dispersion contained within it ideas 
which were to be taken over by Slater. In particular, according to Darwin’s theory, when an 
electromagnetic wave interacts with an atom, the atom acquires a probability of emitting 
a spherical wave-train which interferes with the incident waves, the probability being a 
function of the intensity of the incident radiation. 

Slater (1924) went much further, suggesting that, so long as an atom is in a stationary state,
each atom can communicate with all other atoms through the action of a virtual radiation 
field originating from the virtual oscillators which have frequencies associated with the 
quantum transitions between stationary states — these are the same virtual oscillators which 
were introduced by Ladenburg and Kramers in their theory of dispersion. Here are Slater’s 
own words. 


‘Any atom may, in fact, be supposed to communicate with other atoms all the time it is in a 
stationary state, by means of a virtual field of radiation originating from oscillators having 
the frequencies of possible quantum transitions and the function of which is to provide for 
the statistical conservation of energy and momentum by determining the probabilities for
quantum transitions. The part of the field originating from the given atom itself is supposed
to induce a probability that that atom loses energy spontaneously, while radiation from 
external sources is regarded as inducing additional probabilities that it gain or lose energy, 
much as Einstein has suggested. The discontinuous transition finally resulting from these 
probabilities has no other significance than simply to mark the transfer to a new stationary 
state, and the change from the continuous radiation appropriate to the old state to that of 
the new.’ 


Slater’s vision was that the resultant virtual field would determine the paths of the light 
quanta and also determine the probability that they propagate in a particular direction. 

Bohr and Kramers discussed these ideas intensively with Slater on his arrival in Copenhagen.
Bohr was still not convinced of the reality of light quanta, despite the evidence of
Compton’s experiments on the inelastic scattering of X-rays (Sect. 9.1). The result of these
discussions was the paper by Bohr, Kramers and Slater (1924) which adopted many of
Slater’s ideas, but not those involving light quanta. They explicitly stated that the laws of 
energy and momentum conservation were only to refer to statistical averages, rather than 
to individual interactions. Two quotations will serve to illustrate the radical nature of the 
proposal. 


‘Further, we will assume that the occurrence of transition processes for the given atom 
itself, as well as for the other atoms with which it is in mutual communication, is connected
with this mechanism by probability laws which are analogous to those which in
Einstein’s theory hold for the induced transitions between stationary states when illuminated
by radiation. On the one hand, the transitions which in this theory are designated
as spontaneous are, on our view, considered as induced by the virtual field of radiation 
which is connected with the virtual harmonic oscillators conjugated with the motion of 
the atom itself. On the other hand, the induced transitions of Einstein’s theory occur in 
consequence of the virtual radiation in the surrounding space due to other atoms.’ 


‘As regards the occurrence of transitions, ... we abandon ... any attempt at a causal connexion
between the transitions in distant atoms, and especially a direct application of the
principles of conservation of energy and momentum, so characteristic for the classical 
theories.’ 


Thus, not only has conservation of energy and momentum at the microscopic level been 
abandoned, causality has gone as well. There is no formal working out of these ideas in their 
paper, which Pais refers to as a proposal rather than a theory and also as being ‘obscure in 
style’ (Pais, 1991). 

The Bohr, Kramers and Slater paper presented several experimental and theoretical chal- 
lenges to the community. First of all, it was a challenge to the experimenters to demonstrate 
the validity of the conservation of energy and momentum at the microscopic atomic level. 
As Bohr, Kramers and Slater pointed out, the Compton scattering experiments were based 
upon statistical averages rather than on individual scattering events. Experimental evidence 
on the issue of the validity of the conservation laws was not long in coming. Bothe and 
Geiger carried out Compton scattering experiments using early coincidence techniques, 
in which they measured the arrival times of the scattered X-rays and the ejected K-shell 
electrons (Bothe and Geiger, 1924). The likelihood of the coincidences they observed
occurring by chance was less than one part in $10^5$. This was soon followed by cloud chamber
experiments by Compton and Simon (1925) in which they determined the direction and
time of ejection of the electron from the target. They verified that one recoil electron is 
produced on average for each scattered X-ray photon. Occasionally a secondary electron 
track was created by the X-ray and from these measurements, the geometry of the collision 
could be determined. They found the ‘unequivocal answer’ that the Compton collisions 
obeyed precisely the expectations of the laws of conservation of energy and momentum at 
the atomic level. As they stated 


‘. . . the results do not appear to be reconcilable with the view of the statistical production
of recoil and photo-electrons by Bohr, Kramers and Slater. They are, on the other hand, in 
direct support of the view that energy and momentum are conserved during the interaction 
of radiation and individual electrons.’ 


Thus, although it was evident that classical dynamics could not account for physics at the 
atomic level, the laws of conservation of energy and momentum still had to hold good. 
Although the Bohr, Kramers and Slater proposal was shown to be incorrect in a remark- 
ably short space of time, the paper itself was influential in that it rejected many of the 
fundamental tenets of classical physics. The paper was widely discussed and illustrated the 
necessity for quite different approaches to physics on the scale of atoms. Van der Waerden 
summarised the three radical elements of the paper as follows (van der Waerden, 1967): 


1. Slater’s concept of a virtual radiation field associated with the virtual oscillators of 
atoms; 

2. The statistical conservation of energy and momentum;

3. The statistical independence of the processes of emission and absorption of radiation. 


The legacy of the paper was that the first postulate was to prove to be correct, whilst 
the second and third were in conflict with experiment. Following the intense discussions 
stimulated by the paper, Kramers went on to make his contributions to the understanding 
of the formula for the phenomena of dispersion, culminating in the major paper with 
Heisenberg which was discussed in the last section (Kramers and Heisenberg, 1925). In 
turn, these considerations were to prove to be crucial for Heisenberg when he made the first 
major advances in the development of a completely new approach to quantum physics. 

Equally important was the fact that in these works the concept of atoms was gradually 
being displaced by a much more abstract concept of how physical processes should be 
envisioned at the atomic level. Jammer noted that the introduction of virtual oscillators and 
virtual radiation fields 


‘paved the way for the subsequent quantum-mechanical concept of probability as something
endowed with physical reality and not merely a mathematical category of reasoning.’


Atoms were to be thought of as systems which consisted of virtual oscillators which were 
coupled to other atoms by virtual radiation fields associated with the oscillators. In the 
development which we are about to trace, the concept of atoms themselves became much 
more abstract. This was part of the penalty which had to be paid in coming up with a set of 
rules to replace the laws of classical dynamics.


10.3 Born and ‘quantum mechanics’


The inadequacy of the old quantum theory to account for physics on the scale of atoms was 
now a central theme of the leading theoretical groups, nowhere more so than at Göttingen 
where Max Born and his colleagues were searching for new approaches to the problem of 
translating the structures of classical physics to the atomic domain. This change of attitude 
is reflected in Born’s important paper of 1924 (Born, 1924). In the introduction to the paper 
he wrote, 


‘Since one knows that, in certain circumstances, atoms react to light waves completely 
“non-mechanically” (i.e. they are excited by quantum jumps), it is not to be expected that 
the interaction between electrons of one and the same atom should comply with the laws 
of classical mechanics; this disposes of any attempt to calculate the stationary orbits by 
using a classical perturbation theory complemented by quantum rules.’ 


He had been strongly impressed by the radical proposals by Bohr, Kramers and Slater and by 
Kramers’ paper in which the dispersion formula had been derived. In particular, Kramers’ 
derivation of the dispersion relation involved the interaction between light and the virtual 
oscillators and resulted in a quantum formulation for the process of dispersion while at the 
same time satisfying the correspondence principle in the long wavelength limit. The object 
of Born’s paper was to extend this formalism to the case of the interactions of electrons 
within atoms. The ambition of the paper was to find out whether or not Kramers’ approach 
to quantisation was founded on some more general properties of perturbed mechanical 
systems at the quantum level. In Born’s words, 


‘What we shall do, is to bring the classical laws for the perturbation of a mechanical
system, caused by the internal couplings or external fields, into one and the same form,
which would very strongly suggest the formal passage from classical mechanics to a
“quantum mechanics”.’


For the first time, the term quantum mechanics appears in the literature.

Born begins by developing the classical theory of the perturbation of multiply periodic 
non-degenerate systems of the type discussed in Sect. 5.5. As discussed in Sect. 5.4, action–angle
variables provide the natural system of coordinates for such motions. The perturbed
Hamiltonian is written in the form

$$H = H_0 + \lambda H_1, \tag{10.20}$$

where the perturbation $H_1$ is given by a Fourier series. These procedures were by now well-established
in the literature, particularly because of the success of the formalism in the old
quantum theory. It was also familiar from the perspective of the dynamics of astronomical 
systems where multiply periodic systems, such as the orbits of the planets, are subject to 
small perturbations. Having set up the formalism, Born shows that, if the perturbation is 
associated with incident electromagnetic waves, the classical expression for the dispersion 
is obtained. 

Born now makes the key innovation of this paper by showing how to translate the 
formalism of classical mechanics to a ‘quantum mechanics’. The argument depends upon
two relations which we have already derived. The first is the use of action–angle variables
as the natural system of coordinates for the description of multiply periodic motions, such
as the orbits of electrons in atoms. As demonstrated in Sect. 5.5, the orbital frequency is
given by (5.121),

$$\nu_e = \frac{\mathrm{d}H}{\mathrm{d}J}, \tag{10.21}$$


where H is the Hamiltonian which is the total energy in this case. The second is the ap- 
plication of Bohr’s correspondence principle according to which the classical and quantum 
relations should coincide for large quantum numbers and for small changes in the principal 
quantum number n. This was demonstrated for the hydrogen atom in Sect. 6.3. There, the 
key relation (6.18) can be written 


$$\nu = \alpha\nu_e, \tag{10.22}$$

where $\alpha = \Delta n$ and $\nu_e$ is the orbital frequency of the electron in the orbit with principal
quantum number $n$, $\nu_e = m_e e^4/4\epsilon_0^2 h^3 n^3$. Classically, (10.22) has a natural interpretation in
that $\alpha = 1$ corresponds to the orbital frequency of the electron and $\alpha = 2, 3, 4, \ldots$ correspond
to the Fourier components of the orbital motion which can always be decomposed
into a series of harmonics. Combining (10.21) and (10.22), the formula for the frequencies
of the lines becomes

$$\nu_\alpha = \alpha\nu_e = \alpha\frac{\mathrm{d}H}{\mathrm{d}J}. \tag{10.23}$$

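Let us now write Bohr’s quantum frequency condition for the action variable $J$ as follows:
$J = nh$ and so

$$\nu(n, n-\tau) = \frac{E(n) - E(n-\tau)}{h} = \frac{1}{h}\left\{H(nh) - H[(n-\tau)h]\right\}. \tag{10.24}$$

In the limit in which $n$ is very large and $\tau = 1, 2, 3, \ldots$ small, a Taylor expansion of the
last term on the right-hand side of (10.24) gives

$$\nu(n, n-\tau) = \frac{1}{h}\frac{\mathrm{d}H}{\mathrm{d}J}\Delta J \quad\text{and, since } \Delta J = h\tau, \quad \nu(n, n-\tau) = \tau\frac{\mathrm{d}H}{\mathrm{d}J}. \tag{10.25}$$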


Thus, the classical and quantum formulae (10.23) and (10.24) become the same in the limit 
of large quantum numbers, satisfying Bohr’s correspondence principle. The last term in 
(10.25) can also be written 


$$\nu(n, n-\tau) = \tau\frac{\mathrm{d}H}{\mathrm{d}J} = \frac{\tau}{h}\frac{\mathrm{d}H}{\mathrm{d}n}. \tag{10.26}$$


Born’s key insight was that the translation between the classical and quantum prescriptions
involved replacing the differential $\tau\,\mathrm{d}H/\mathrm{d}n$ by the difference $H(n) - H(n-\tau)$. Born
then postulates that this translation should apply more generally to all transitions between
the classical and quantum formulations. Thus, for any arbitrary function $\phi(n)$ which defines
the stationary state $n$, the differential $\tau\,\mathrm{d}\phi/\mathrm{d}n$ should be replaced by the difference
$\phi(n) - \phi(n-\tau)$. Symbolically,

$$\tau\frac{\mathrm{d}\phi(n)}{\mathrm{d}n} \longleftrightarrow \phi(n) - \phi(n-\tau). \tag{10.27}$$

This will play an important role in the future development of the structure of the theory.
For convenience, Jammer (1989) refers to this equivalence as Born’s correspondence rule.
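
The correspondence-principle argument behind (10.24) to (10.27) can be checked numerically for hydrogen. The following is a minimal Python sketch, not taken from Born's paper: it assumes frequencies are measured in units of the Rydberg frequency $R = m_e e^4/8\epsilon_0^2 h^3$, so that $H(n)/h = -1/n^2$ and the orbital frequency is $\nu_e = 2/n^3$. The function names are hypothetical and chosen only for illustration.

```python
def level(n):
    """H(n)/h for the hydrogen stationary state n, in units of the Rydberg frequency."""
    return -1.0 / n**2


def quantum_frequency(n, tau):
    """Difference form: [H(n) - H(n - tau)] / h, the right-hand side of (10.27)."""
    return level(n) - level(n - tau)


def classical_frequency(n, tau):
    """Differential form: (tau / h) dH/dn = tau * nu_e, with nu_e = 2 / n**3."""
    return tau * 2.0 / n**3


def ratio(n, tau):
    """Quantum line frequency divided by the tau-th classical harmonic."""
    return quantum_frequency(n, tau) / classical_frequency(n, tau)


if __name__ == "__main__":
    # The ratio should approach 1 as n grows with tau held small.
    for n in (5, 50, 500, 5000):
        print(n, [round(ratio(n, tau), 5) for tau in (1, 2, 3)])
```

Under these assumptions the ratio equals $n(2n-\tau)/2(n-\tau)^2$, so the difference and the differential agree only when $n \gg \tau$. For small quantum numbers the two prescriptions differ appreciably, which is why the replacement (10.27) has real content.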

Born then went on to show how, using this formulation, Kramers’ quantum dispersion 
formula (10.19) could be derived. While Born was working on this paper, Heisenberg was 
his assistant at Göttingen and he contributed to the calculations carried out in the paper. As 
Born acknowledges in a footnote 


‘By happy coincidence, I was able to discuss the contents of this paper with Mr. Niels Bohr 
which contributed greatly to a clarification of the concepts. I am also greatly indebted to 
Mr. W. Heisenberg for much advice and help with the calculations.’ (Born, 1924) 


Heisenberg was only 22 years old at the time, but with his quick grasp of physical concepts 
and his technical expertise, he greatly benefitted from these studies with Born. He next 
spent the winter of 1924-1925 in Copenhagen working with Bohr and Kramers. It was 
there that he and Kramers worked out the full quantum theory of scattering and dispersion 
discussed in Sect. 10.1 (Kramers and Heisenberg, 1925). Their rigorous development of the 
theory started with the classical picture and then the transition to the quantum formulation 
was achieved by replacing the differential quotients $\partial/\partial J$ by the appropriate difference
quotients, according to Born’s correspondence rule. 

Their result for the polarisation in the state $i$ was given by an expression similar to
(10.19) which we quoted above from Kramers’ paper (Kramers, 1924), but with a significant
difference in notation. Using (10.16) to write the polarisation in terms of $f_{ik}$ rather than
the $A$ coefficients, their expression can be written

$$P = \chi E = \frac{e^2 E}{4\pi^2 m_e}\left[\sum_{E_k > E_i}\frac{f_{ik}}{\nu_{ik}^2 - \nu^2} - \sum_{E_{k'} < E_i}\frac{f_{ik'}}{\nu_{ik'}^2 - \nu^2}\right]. \tag{10.28}$$


In the classical development $f_i$ was the number of dispersing electrons in the state $i$, whereas
in the new formulation the $f_{ik}$ are not necessarily integers. They are better described as the
oscillator strengths, the terminology introduced by Pauli.

Although the significance of the $f_{ik}$ has changed from the classical meaning, there is
an important relation discovered by Kuhn (1925) and independently by Thomas (1925)
concerning the sum of all the oscillator strengths of a given state $i$. Use is again made
of the correspondence principle. Let us first simplify the notation by writing $f_a$ for the
$f_{ik}$, meaning absorption transitions from the state $i$, and $f_e$ for the $f_{ik'}$, meaning induced
emission terms which must appear according to the analyses of Kramers and van Vleck.
Consider now the limit in which the frequency $\nu$ is very much greater than $\nu_{ik}$ and $\nu_{ik'}$.
Then, the quantum expression for the polarisation of an atom in a state $i$ becomes

$$P = \chi E = -\frac{e^2 E}{m_e\omega^2}\left[\sum_k f_a - \sum_{k'} f_e\right], \tag{10.29}$$

where $\omega = 2\pi\nu$. The total energy loss rate of an oscillator is given by Thomson’s classical formula

$$-\left(\frac{\mathrm{d}E}{\mathrm{d}t}\right)_{\rm rad} = \frac{\omega^4 e^2 |x_0|^2}{12\pi\epsilon_0 c^3} = \frac{\omega^4 |p_0|^2}{12\pi\epsilon_0 c^3}. \tag{10.30}$$

From the classical analysis of the response of an oscillator to an incident electromagnetic
wave, in the limit $\omega \gg \omega_0$ (Sect. 2.3.3),


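$$|x_0|^2 = \frac{e^2 E_0^2}{m_e^2\omega^4}, \tag{10.31}$$

and so

$$-\left(\frac{\mathrm{d}E}{\mathrm{d}t}\right)_{\rm cl} = \frac{e^4 E_0^2}{12\pi\epsilon_0 c^3 m_e^2}. \tag{10.32}$$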


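Now, let us work out the corresponding dispersion loss using the quantum formula (10.29)
for a single state $i$ within an atom for which the polarisation is $P_i = p_0$,

$$-\left(\frac{\mathrm{d}E}{\mathrm{d}t}\right)_i = \frac{\omega^4 |p_0|^2}{12\pi\epsilon_0 c^3} = \frac{e^4 E_0^2}{12\pi\epsilon_0 c^3 m_e^2}\left[\sum_k f_a - \sum_{k'} f_e\right]^2. \tag{10.33}$$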


Kuhn argued that, according to the correspondence principle, (10.32) and (10.33) should 
be the same in the classical limit. Therefore, the oscillator strengths for a given state i of 
an atom should obey the rule 





$$\sum_k f_a - \sum_{k'} f_e = 1. \tag{10.34}$$


This is the sum rule of Kuhn and Thomas and was to play an important role in the formulation 
of quantum mechanics. 
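
The comparison of (10.32) with (10.33) that leads to the sum rule can be sketched numerically. The Python fragment below is an illustration under stated assumptions rather than a reconstruction of Kuhn's or Thomas's calculation: it evaluates the classical loss rate and the loss rate obtained from the high-frequency polarisation (10.29) in SI units, and forms their ratio. The function names, the default field amplitude and driving frequency, and the oscillator-strength values used in the demonstration are all hypothetical.

```python
import math

# Physical constants in SI units (CODATA values)
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
C = 2.99792458e8             # speed of light, m/s
E_CHARGE = 1.602176634e-19   # elementary charge, C
M_E = 9.1093837015e-31       # electron mass, kg


def classical_loss(e0):
    """Classical loss rate (10.32) for a driving-field amplitude e0 in V/m."""
    return E_CHARGE**4 * e0**2 / (12 * math.pi * EPS0 * C**3 * M_E**2)


def quantum_dipole(e0, omega, f_abs, f_em):
    """Dipole amplitude from (10.29), valid when omega far exceeds all line frequencies."""
    return -(E_CHARGE**2 * e0 / (M_E * omega**2)) * (sum(f_abs) - sum(f_em))


def quantum_loss(e0, omega, f_abs, f_em):
    """Loss rate (10.33): the dipole amplitude inserted into the radiation formula (10.30)."""
    p0 = quantum_dipole(e0, omega, f_abs, f_em)
    return omega**4 * p0**2 / (12 * math.pi * EPS0 * C**3)


def loss_ratio(f_abs, f_em, e0=1.0, omega=1.0e17):
    """Quantum loss divided by classical loss; equals (sum f_a - sum f_e)**2."""
    return quantum_loss(e0, omega, f_abs, f_em) / classical_loss(e0)


if __name__ == "__main__":
    # Made-up oscillator strengths: the first set satisfies the sum rule, the second does not.
    print(loss_ratio([0.6, 0.7], [0.3]))
    print(loss_ratio([1.5], [0.0]))
```

The ratio is independent of the field amplitude and of the driving frequency, and it equals unity only when the absorption strengths exceed the emission strengths by exactly one, which is the content of (10.34).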


10.4 Mathematics and physics in Göttingen





The developments described in this chapter were the last contributions prior to Heisenberg’s 
epochal paper of 1925 which was to lead eventually to quantum mechanics as we know 
it today. The theoretical work was carried out principally in Göttingen and Copenhagen, 
under the guidance of a number of distinguished experimental and theoretical physicists. 
The Copenhagen Institute for Theoretical Physics was a very recent foundation, thanks to 
the efforts of Bohr, and very quickly became an international centre of excellence in all 
aspects of quantum physics. The most distinguished theorists visited Bohr regularly and 
were put through the most rigorous examinations of their researches, principally by Bohr 
himself. 

The development of mathematical, theoretical and experimental research in Göttingen 
was very different. There had been chairs of physics and mathematics in the University of
Göttingen since the mid-eighteenth century. The great tradition of mathematics research was 
established by Carl Friedrich Gauss who became Professor of Mathematics in 1807. Gauss’s 
vast contributions to mathematics need not be elaborated here, except to note that he strongly 
encouraged the application of mathematics to physical problems and made many important 
contributions. Wilhelm Weber was appointed to the Professorship of Physics in 1831 and 
was at the forefront of the theoretical attack upon the problems of electromagnetism.
Gauss was succeeded by a galaxy of distinguished mathematicians including Lejeune
Dirichlet, Bernhard Riemann, Alfred Clebsch and Felix Klein — Arnold Sommerfeld was 
a student of Klein. In a separate appointment, David Hilbert became Director of the 
Mathematics Institute in 1895. Four new institutes were founded in 1905 to complement 
the existing Physics and Mathematics Departments and the Astronomical Observatory, 
the Director of which was Karl Schwarzschild. The new institutes were the Geophysical 
Institute, the Institute of Applied Electricity, the Institute of Applied Mechanics and the 
Institute of Applied Mathematics, the respective directors being Emil Wiechert, Hermann
Theodor Simon, Ludwig Prandtl and Carl Runge. 

David Hilbert’s appointment as Professor of Mathematics led to a remarkable flowering 
of research in mathematics, what Mehra and Rechenberg describe as 


‘a school of phenomenal brilliance, which had never been equalled before in the history
of mathematics.’


In 1900, Hilbert enunciated his 23 great problems of mathematics, which were to challenge 
the greatest mathematicians for the next century (Hilbert, 1900). Hilbert was not only 
interested in pure mathematics, however, but also in its application to physics and in 1902 
he laid out his plans for a closer involvement of mathematics in physics. In 1903, he attracted 
his old friend from Königsberg, Hermann Minkowski, to the Chair of Mathematics but sadly 
Minkowski died in 1909 of appendicitis. This event interrupted Hilbert’s plans for physics, 
but his endeavours were greatly helped by the endowment of the Wolfskehl Prize for the 
proof of Fermat’s last theorem, which has already appeared in our story in connection 
with Bohr’s Wolfskehl lectures of 1922 (Sect. 8.2). While Fermat’s last theorem remained 
unsolved, the Mathematical Institute was allowed to use the interest on the capital, which 
amounted to about 5,000 marks per year, to endow the series of Wolfskehl lectures. Hilbert 
was encouraged by his colleagues to submit a solution to Fermat’s last theorem, but Hilbert 
refused, remarking 


‘Why should I kill the goose that lays the golden egg?’


Hilbert and Klein wanted to keep abreast of developments in physics, remarking that 
‘Physics is much too hard for physicists’. As discussed in Sect. 8.2, Hilbert used the interest 
on the Wolfskehl endowment to invite distinguished lecturers in physics to Göttingen on a 
regular basis. Poincaré delivered the lectures in 1909, Lorentz in 1910 and Sommerfeld in 
1912. The sixth of the 23 problems which Hilbert had formulated in his 1900 lecture was 
‘to establish the axioms of theoretical physics’. 

In pursuit of this objective, he invited Einstein in the summer of 1915 to lecture on the 
development of the general theory of relativity. In late 1915, just weeks after the appear- 
ance of Einstein’s great paper of 1915, Hilbert independently derived the field equations 
of general relativity (Hilbert, 1915). In fact, he went significantly further. In his axiomatic 
approach, he assumed that, as in general relativity, at each point in space-time 10 gravitational
potentials $g_{\mu\nu}$ ($\mu, \nu = 1, 2, 3, 4$) are required to describe the gravitational field and
four potentials $A_\mu$ ($\mu = 1, 2, 3, 4$) for the electromagnetic field. From these potentials,
he formulated a Hamiltonian, a ‘world-function’ $H$, which depended on the $g_{\mu\nu}$ and their
first and second spatial derivatives and the $A_\mu$ and their spatial derivatives. The first axiom
asserted that the laws of physics should be determined by taking the variations of the integral
$\int H\sqrt{g}\,\mathrm{d}x^4$, where $g$ is the determinant of the metric tensor $g_{\mu\nu}$. The second axiom
was the ‘axiom of general invariance’ according to which H must be invariant with respect 
to arbitrary coordinate transformations. From these axioms, Hilbert showed that general 
relativity and a generalised form of Maxwell’s equations could be derived. Klein went on 
to unify the different analyses of energy-momentum conservation in general relativity by 
Einstein, Hilbert and Lorentz. In a further brilliant analysis, Emmy Noether, a student of 
Hilbert and Klein, established the relationship between conservation laws and symmetry 
(Noether, 1918). 

Hilbert and Klein took the view that this field-theoretic approach had solved Hilbert’s 
problem 6 and provided the basis for the future development of theoretical physics. Although 
Einstein admired greatly Hilbert’s deep understanding of mathematical physics, he was not 
an enthusiast for Hilbert’s approach to field theory. Nonetheless, he continued to develop 
the field-theoretic approach and this resulted in his long, and in the end fruitless, endeavours 
to find a unified field theory. It would turn out that the breakthrough did not come from the 
Einstein-Hilbert approach, but from a quite different route. 

In 1919, Klein retired from his position as Professor of Mathematics because of ill health 
and was succeeded by Richard Courant, who was not only an outstanding mathematician, 
but also an excellent organiser. He began a series of monographs entitled Fundamental Topics
of Mathematical Sciences in Monographs. Significantly, No. 12 of the series was the first
volume of the famous monograph The Methods of Mathematical Physics by Courant and 
Hilbert (1924). The book was based upon those aspects of Hilbert’s lectures which found 
application in theoretical physics. This monograph set out systematically the mathematical 
tools needed to tackle the mathematical physics of quantum theory. Courant also developed 
ambitious plans for a mathematical institute similar to the institutes for the experimental 
sciences and his success in this endeavour resulted in the opening of the Mathematical 
Institute in 1929. 

In physics, the older generation was represented by Riecke and Voigt but they supported
the researches of the new generation of physicists which included Max Abraham, Johannes
Stark and Walther Ritz. In 1920, Riecke retired and was replaced by Pohl. In the same year,
Debye left for Zurich and by good fortune it proved possible to appoint simultaneously 
James Franck to the chair of experimental physics and Max Born to the chair of theoretical 
physics. 

This remarkable battalion of mathematicians and physicists was about to tackle the quite 
different horizons opened up by Heisenberg’s new insights. Once the way was shown, 
the strength of mathematical physics meant that the development of the theory proceeded 
rapidly. The glory years were to continue until 1933 when the Nazi prohibition on those 
of Jewish descent from holding university positions wreaked havoc with the brilliance of 
Göttingen mathematics and physics. From the Göttingen faculties alone, Hermann Weyl, 
Richard Courant, Edmund Landau, Emmy Noether, Max Born and James Franck all emigrated
in the face of Nazi oppression.


11.1 Heisenberg in Göttingen, Copenhagen and Helgoland


Werner Heisenberg studied under Sommerfeld in Munich and was present at the Bohr
Festspiele held in Göttingen in 1922 (Sect. 8.2). Although aged only 20, he challenged 
Bohr’s support of Kramers’ analysis of the quadratic Stark effect, having studied the paper 
in detail for Sommerfeld’s seminar in Munich. The result was a long walk with Bohr during 
which they discussed this topic and the more general problems of quantum physics. This 
encounter made a strong impression on Heisenberg. Much later Heisenberg stated: 


‘That discussion, which took us back and forth over Hainberg’s wooded heights, was the
first thorough discussion I can remember on the fundamental physical and philosophical 
problems of modern atomic theory, and it has certainly had a decisive influence on my 
later career. For the first time, I understood that Bohr’s view of his theory was much more 
sceptical than that of many other physicists — for example, Sommerfeld — at that time, and 
that his insight into the structure of the theory was not a result of mathematical analysis 
of the basic assumptions, but rather of an intense occupation with the actual phenomena, 
such that it was possible for him to sense the relationships intuitively rather than derive 
them formally.’ 


Heisenberg spent the winter of 1922-1923 working in Göttingen as Born’s assistant. 
The astronomers had made great progress in the use of perturbation techniques within the
action–angle formulation of classical dynamics to study the gravitational perturbations of
planetary orbits. Born and Heisenberg adapted these procedures to the case of the orbits of 
electrons in atoms, in particular, the problem of the two-electron helium atom. Although 
some qualitative features of the helium atom could be accounted for, the quantitative results 
did not agree with experiment (Born and Heisenberg, 1923a,b). 

Born was so impressed by Heisenberg’s abilities that, following Pauli’s departure from 
Göttingen, he asked Sommerfeld if he would release Heisenberg to become his assis- 
tant once he had completed his doctoral dissertation in Munich. Sommerfeld agreed to 
this. Heisenberg completed his PhD dissertation on turbulence in Munich and returned to 
Göttingen in October 1923 as Born’s assistant. Born has provided a portrait of Heisenberg 
at that time. 


‘He looked like a simple peasant boy, with short, fair hair, clear bright eyes and a charming 
expression. He took his duties as an assistant more seriously than Pauli and was a great 
help to me. His incredible quickness and acuteness of apprehension has always enabled 
him to do a colossal amount of work without much effort: he finished his hydrodynamical
thesis, worked on atomic problems partly alone, partly in collaboration with me, and
helped me to direct my research students.’ (van der Waerden, 1967) 


In the course of his studies with Sommerfeld in Munich and with Born in Göttingen, 
Heisenberg perfected his use of their approaches and techniques, in which well-defined 
mechanical problems were tackled systematically according to the precepts of the old 
quantum theory. He had already encountered Bohr’s quite different approach during the 
1922 Wolfskehl lectures. Bohr had also been deeply impressed by Heisenberg’s abilities 
and invited him to Copenhagen for a visit in March 1924. Bohr took a personal interest 
in Heisenberg’s intellectual development which was deepened during a three-day walking 
tour through the island of Zealand. Whereas Sommerfeld and Born were interested in the 
solution of well-posed problems which were solved by rigorous mathematics, Bohr was 
much more interested in the underlying physical concepts which he related to the results 
of experiment. In this, he was much more akin to Einstein for whom physical concepts 
were the basis of theory which were then elaborated using the appropriate mathematical 
tools. Heisenberg appreciated that Bohr was thinking deeply and philosophically about the 
underlying physical concepts which had to be incorporated into a new system of quantum 
mechanics — Heisenberg’s specific interests in the Zeeman effect and the problems of 
multi-electron atoms paled into insignificance in the light of Bohr’s deeper concerns. 

During early 1924, Bohr was preoccupied with the development of the Bohr-Kramers- 
Slater theory which sought to incorporate the concept of light quanta into the Bohr- 
Sommerfeld picture of the atom, at the expense of the exact conservation of energy and 
momentum in interactions between matter and radiation (Sect. 10.2) (Bohr et al., 1924). 
At the same time, Kramers was developing his improved version of Ladenburg’s theory of 
dispersion by replacing the atom by a collection of virtual oscillators (Sect. 10.1) (Kramers, 
1924). The collaboration between Copenhagen and Göttingen intensified during this period 
and, during a visit by Bohr to Göttingen, he invited Heisenberg to visit for a more extended
period during the winter semester of 1924-1925. Heisenberg duly arrived in Copenhagen
in September 1924. 

Kramers, who was five years older than Heisenberg, had been a collaborator of Bohr’s 
since 1916 and had become Bohr’s principal assistant when the Institute for Theoretical 
Physics in Copenhagen was opened in 1920. Initially, Heisenberg was somewhat in awe of 
Kramers’ sophistication and technical ability but they soon became good friends. During 
this longer visit, Heisenberg fully absorbed the ‘Bohrian’ approach and, in particular, 
became an adherent of the correspondence principle and its refinement in tackling the 
problems of quantum physics. It was during this period that Kramers and Heisenberg 
developed the much more complete quantum picture of the dispersion of light incorporating 
Born’s correspondence rule (Sect. 10.3) (Kramers and Heisenberg, 1925). Also, during his 
visit, Heisenberg revisited the thorny problems of the anomalous Zeeman effect and the 
interpretation of complex spectra. 

In April 1925, Heisenberg returned to Germany, taking first a well-deserved break in 
Southern Germany before taking up his position as a Privatdozent at Göttingen at the 
beginning of the next semester. With Heisenberg, Pascual Jordan and Friedrich Hund, Born 
had assembled a very powerful team for theoretical physics and spectroscopy in Göttingen.
In the meantime, Bohr had become more pessimistic about his recent researches. The
Bohr-Kramers-Slater picture was no longer tenable in the light of the experiments of 
Bothe and Geiger (1924) and he lost faith in the models of multi-electron atoms and 
complex spectra, which were the basis of Heisenberg’s researches at Copenhagen. As a 
result, Heisenberg changed the direction of his research to the study of the intensities 
of the spectral lines of the hydrogen atom. His reasoning was that hydrogen is a single 
electron system and so is not subject to the difficulties of helium or multi-electron atoms. 
His intention was to apply Born’s correspondence rule for quantisation to the radiation of 
the electron in circular and elliptical orbits about the nucleus. To achieve this, he had to 
develop Fourier series in two dimensions for elliptical orbits and this programme rapidly 
became prohibitively complex. Instead, he turned to a simpler problem, the emission of a 
nonlinear oscillator. As we will discuss in Sect. 11.5, the nonlinear oscillator has the feature 
that higher harmonics of the fundamental frequency can be found by recursive formulae. 
Some of these ideas were foreshadowed in correspondence with Kronig in 1925.

By 1925, as the old quantum theory seemed to be irretrievably incapable of accounting 
for the experimental data, Heisenberg decided that a more radical approach was needed, 
inspired by the success of Ladenburg’s, Kramers’ and Born’s quite different approach to the 
interaction of radiation with atoms. As discussed in Chap. 10, the orbits of electrons in atoms 
were dispensed with and replaced by virtual oscillators with frequencies corresponding to 
the observed lines in the spectra of atoms. In an inspired application of the correspondence 
principle, Heisenberg argued that the Born approach to quantisation should be applied, not 
only to quantities such as the frequencies of the virtual oscillator, but also to the kinematics 
of the electron itself. The great, and profound, insight was that the spatial position of 
the electron x(t) should be subject to the Born correspondence rule. In other words, the 
classical kinematics of Galileo and Newton had to be replaced by their quantum theoretical 
counterparts. The breakthrough occurred in June 1925 when Heisenberg suffered a severe 
bout of hay fever. To recover, he took a break on the barren German island of Helgoland 
in the North Sea — with its offshore climate, the island is almost free of pollen. During 
the nine or ten days he was there, he worked out the mathematics of these ideas. On his 
way back to Göttingen, he discussed his new ideas with Pauli in Hamburg, and then in 
Göttingen wrote up his paper on the subject, sending it to Pauli for his urgent consideration. 
Although notoriously critical, Pauli was impressed. Born recognised the significance of what 
Heisenberg had achieved and submitted it for publication to the Zeitschrift für Physik. This
paper marks the beginning of the major change of direction which was needed to develop 
a consistent theory of quantum mechanics. Let us investigate exactly what Heisenberg did. 


11.2 Quantum-theoretical re-interpretation of kinematic and mechanical relations (Heisenberg, 1925)


Heisenberg’s great paper of 1925 is very far from a final polished theory of quantum
mechanics, but rather an exploration of a route to the resolution of the intractable problems
faced by the old quantum theory (Heisenberg, 1925). Mehra and Rechenberg (1982b) point
out the remarkable similarities to Einstein’s great paper of 1905, On the electrodynamics of 
moving bodies (Einstein, 1905c), which established the new framework of special relativity 
and which abolished the concepts of absolute space and time. In contrast to Einstein’s paper, 
which is a complete exposition of the underlying physics of the special theory of relativity, 
Heisenberg’s paper is a work in progress. 

Heisenberg’s summary at the beginning of the paper is as follows: 


‘The present paper seeks to establish a basis for theoretical quantum mechanics founded
exclusively upon relations between quantities which in principle are observable.’


He begins by reviewing the well-known problems of the old quantum theory and states 
unambiguously that 


‘the Einstein-Bohr frequency condition (which is valid in all cases) already represents 
such a complete departure from classical mechanics, or rather... from the kinematics 
underlying this mechanics, that even for the simplest quantum theoretical problems the 
validity of classical mechanics cannot be maintained.’ 


But he has a further concern. The old quantum theory is based upon calculations of the
orbits and orbital frequencies of electrons in atoms and he contends that these are
unobservable quantities. The quantities which are observable are the frequencies and
intensities of emission and absorption lines; the positions of electrons in atoms are in
principle unobservable. Much later, this concept would be incorporated into Heisenberg’s
uncertainty principle, but that was still a long way in the future. He goes on:


‘In this situation, it seems sensible to discard all hope of observing hitherto unobservable 
quantities, such as the position and period of the electron, and to concede that the partial 
agreement of the quantum rules with experience is more or less fortuitous. Instead, it 
seems more reasonable to try to establish a theoretical quantum mechanics, analogous to 
classical mechanics, but in which only relations between observable quantities occur.’ 


Heisenberg later stated that he had been strongly influenced by Einstein’s paper on special 
relativity of 1905 in his insistence upon considering only observable quantities. The crucial 
first part of Einstein’s paper is entitled Kinematical part and demolishes Newton’s concepts 
of absolute space and time. Space and time need to be defined in terms of measurable 
quantities with the result that simultaneity becomes relative, the key concept of the relativity 
of simultaneity (Rindler, 2001). By using the two principles of relativity, namely that the 
laws of physics should be form-invariant between inertial frames of reference and that 
the speed of light is a constant in all such frames, Einstein discovered operationally self- 
consistent definitions of space and time. To express this in another way, what had gone 
wrong with Newtonian physics was the description of the kinematics of particles, that is, 
the way in which points in four-dimensional space-time are described. The new kinematics 
of particles and light waves are fully derived in this first kinematical part of Einstein’s great 
paper and are embodied in the Lorentz transformations. 

Heisenberg was about to carry out a similar revolution in the concept of kinematics on the
scale of atoms. He was to be guided by the insights gained from Kramers’, Born’s and his own
studies of dispersion phenomena. In particular, he was to use Born’s correspondence rule
to create a quantum mechanics in which only relations between observables are permitted.

Heisenberg’s paper is not the easiest to comprehend. As we will see, even physicists
such as Enrico Fermi and Steven Weinberg have found the content difficult and somewhat
obscure. The problem is compounded by changes of notation which Heisenberg adopted
in the course of the paper. Another problem is that Heisenberg does not describe how the
results he presents were obtained. These problems are addressed and resolved in an
impressive paper by Ian Aitchison and his colleagues (2004) which will be used in the
exposition which follows. In the following sections, I have translated Heisenberg’s arguments
into a single self-consistent notation which I hope does justice to the revolutionary content
of the paper.


11.3 The radiation problem and the translation from classical to quantum physics


In the next part of the paper (Sect. 1), the basic postulates of the new approach are laid out, 
in particular, the way in which the correspondence rule is to be used to translate between 
classical and quantum physics. Heisenberg begins with the formula for the intensity of 
emission in classical electrodynamics. First of all, he notes that the standard results for 
dipole radiation are only the first step in the evaluation of higher order terms associated 
with quadrupole and higher multipoles of the acceleration history of the electron. These 
can all be determined by first taking the Fourier transform of the motion of the electron and 
then working out the intensity of radiation associated with each Fourier component. But, 
he has a deep insight — in his words, 


‘The point has nothing to do with electromagnetism but rather – and this seems particularly
important – is of a purely kinematic nature. We may pose the question in its simplest form
thus: If instead of a classical quantity $x(t)$ we have a quantum-theoretic quantity, what
quantum-theoretic quantity will take the place of $x(t)^2$?’


We can appreciate his line of thinking by recalling the expression for the dipole radiation
of an oscillator. In the simplest case of dipole radiation, the radiation rate is given by (2.1),
namely

$$
-\left(\frac{\mathrm{d}E}{\mathrm{d}t}\right)_{\rm rad} = \frac{|\ddot{p}|^2}{6\pi\varepsilon_0 c^3} = \frac{e^2|\ddot{x}|^2}{6\pi\varepsilon_0 c^3} . \tag{11.1}
$$





For the case of an oscillator performing simple harmonic oscillations with amplitude $x_0$ at
angular frequency $\omega_0$, $x = |x_0| \exp(\mathrm{i}\omega_0 t)$ and then the average radiation loss rate is given
by (2.3),

$$
-\left(\frac{\mathrm{d}E}{\mathrm{d}t}\right)_{\rm average} = \frac{e^2\omega_0^4|x_0|^2}{12\pi\varepsilon_0 c^3} . \tag{11.2}
$$

Let us note some features of this expression which were important for the pioneers of
quantum mechanics. The electric field strength at distance $r$ from the accelerated charge in
the far field limit is given by the expression

$$
E_\theta = \frac{|\ddot{p}|\sin\theta}{4\pi\varepsilon_0 c^2 r} = \frac{e|\ddot{x}|\sin\theta}{4\pi\varepsilon_0 c^2 r} = \frac{e\omega_0^2|x_0|\sin\theta}{4\pi\varepsilon_0 c^2 r} , \tag{11.3}
$$

where $\boldsymbol{p} = e\boldsymbol{r}$ is the dipole moment of the electron. The radiation loss rate (11.1) is
obtained from (11.3) by integrating the Poynting vector flux $S = |\boldsymbol{E} \times \boldsymbol{H}| = E_\theta^2/Z_0$ of the
emitted radiation over $4\pi$ steradians, that is, over $\tfrac{1}{2}\sin\theta\,\mathrm{d}\theta$, where $Z_0 = (\mu_0/\varepsilon_0)^{1/2}$ is the
impedance of free space. The total radiation rate can be related to Einstein’s coefficient
$A_m^n$ for spontaneous emission by the following argument. We recall that $A_m^n$ is defined for
the isotropic emission of the oscillator, whereas (11.2) corresponds to the dipole emission
of a single oscillator. Just as we argued in Sect. 2.3.3, we obtain the isotropic result if we
consider the total loss rate to be associated with three mutually perpendicular oscillators.
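Then, we find

$$
h\nu A_m^n = 3 \times \frac{\omega_0^4 e^2|x_0|^2}{12\pi\varepsilon_0 c^3} , \qquad A_m^n = \frac{\omega_0^3 e^2|x_0|^2}{2\varepsilon_0 c^3 h} . \tag{11.4}
$$

Note the key result that $A_m^n$ is the probability of spontaneous emission and that it depends
upon the square of the amplitude of the oscillator $|x_0|^2$. This is why Heisenberg asks the
question, ‘... what quantum-theoretic quantity will take the place of $x(t)^2$?’

There is one further link between $|x_0|$ and electrodynamics. In the classical theory of the
emission of electromagnetic waves, it is simplest to derive the expression for the electric
field strength $\boldsymbol{E}$ of the wave from the vector potential $\boldsymbol{A}$ using the relations

$$
\boldsymbol{A} = \frac{\mu_0}{4\pi}\frac{\dot{\boldsymbol{p}}}{r} = \frac{\mu_0}{4\pi}\frac{e\dot{\boldsymbol{r}}}{r} ; \qquad \boldsymbol{E} = -\frac{\partial \boldsymbol{A}}{\partial t} = -\frac{e\ddot{\boldsymbol{r}}}{4\pi\varepsilon_0 c^2 r} . \tag{11.5}
$$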





Thus, for an oscillator of angular frequency $\omega_0$,

$$
|\boldsymbol{E}| = \left|\frac{\partial \boldsymbol{A}}{\partial t}\right| = \omega_0|\boldsymbol{A}| . \tag{11.6}
$$

Thus, comparing (11.3) and (11.6), the magnitude of the vector potential is also proportional
to $|x_0|$, the maximum displacement of the oscillator.

After this mild detour into classical electrodynamics, let us return to the development of
Heisenberg’s paper. He takes the radical point of view that what had gone wrong with the
old quantum theory is that the spatial coordinates $x_0$ should be replaced by their quantum
analogues. This is a dramatic step. As remarked by Jammer (1989),


‘Heisenberg, influenced by both Sommerfeld and Bohr, considered now the possibility 
of ‘guessing’ — in accordance with the correspondence principle — not the solution of a 
particular quantum-theoretic problem but the very mathematical scheme for a new theory 
of mechanics. By integrating in this way the correspondence principle once and for all in 
the very foundations of the theory, he expected to eliminate the necessity of its recurrent 
application to every problem individually without jeopardising its general validity.’ 


A key difference which has to be built into this translation is that, according to classical
electrodynamics, the radiation rate depends only upon one quantum number, the principal
quantum number $n$ which determines the orbital frequency of the electron; there may also
be higher harmonics of the orbital frequency, but essentially the radiation depends only on
orbital frequency. This is formalised by (10.23) in which the classical frequencies of the
emission are given by

$$
\nu(\alpha) = \alpha\nu_0 = \alpha\frac{\mathrm{d}W}{\mathrm{d}J} , \tag{11.7}
$$

where $\alpha$ is an integer, $\alpha = 1, 2, 3, \ldots$. In the old quantum theory, the orbit with principal
quantum number $n$ has action variable $J = nh$ and then (11.7) becomes

$$
\nu(n, \alpha) = \alpha\nu(n) = \alpha\frac{\mathrm{d}W}{\mathrm{d}J} = \alpha\frac{1}{h}\frac{\mathrm{d}W}{\mathrm{d}n} . \tag{11.8}
$$

In contrast, Bohr’s frequency relation necessarily depends upon the properties of two
states through the relation

$$
\nu(n, n-\tau) = \frac{1}{h}\left[W(n) - W(n-\tau)\right] , \tag{11.9}
$$

where the $W$s represent the binding energies of the stationary states and $\tau = 1, 2, 3, \ldots$.

There is also a contrast in the ways in which combinations of frequencies are added in
classical and quantum theory. Classically, for the case of radiation of harmonics $\alpha$ and $\beta$
associated with principal quantum number $n$, the combination rule is

$$
\nu(n, \alpha) + \nu(n, \beta) = \nu(n, \alpha + \beta) . \tag{11.10}
$$

This type of combination of frequencies occurs in the nonlinear coupling between the
harmonics $\alpha$ and $\beta$. The corresponding quantum combination rule for the virtual oscillators
associated with transitions from the stationary state $n$ to $n - \tau'$ and then to $n - \tau' - \tau$ is
Ritz’s combination principle (Sect. 1.6) which Heisenberg writes:

$$
\nu(n, n-\tau') + \nu(n-\tau', n-\tau'-\tau) = \nu(n, n-\tau'-\tau) , \tag{11.11}
$$

$$
\nu(n-\tau, n-\tau'-\tau) + \nu(n, n-\tau) = \nu(n, n-\tau'-\tau) . \tag{11.12}
$$


Notice the important point that at the atomic level, the classical frequency rule corresponds
to the frequency of the electron’s orbit and its harmonics, whereas the quantum theoretical
rule does not depend upon the orbital frequency, but upon the energy differences between
the three states defined by $n$, $n - \tau'$ and $n - \tau' - \tau$; the associated frequencies in (11.11)
and (11.12) are all observables. This completes the first part of Heisenberg’s agenda of
working only in terms of observable quantities.
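
The contrast between the two combination rules can be checked numerically. The following is a minimal sketch, assuming hydrogen-like term values $W(n) = -1/n^2$ in arbitrary units with $h = 1$; these values and the function names `W` and `nu` are illustrative assumptions, not taken from Heisenberg's paper.

```python
import math

h = 1.0  # Planck's constant in arbitrary units (assumption for illustration)

def W(n):
    """Hypothetical hydrogen-like term values, W(n) = -1/n^2 in arbitrary units."""
    return -1.0 / n**2

def nu(n, m):
    """Bohr frequency for the transition n -> m, in the form of (11.9)."""
    return (W(n) - W(m)) / h

n, tau_prime, tau = 6, 2, 3

# Ritz combination in the form of (11.11): n -> n - tau' -> n - tau' - tau
ritz_lhs = nu(n, n - tau_prime) + nu(n - tau_prime, n - tau_prime - tau)
ritz_rhs = nu(n, n - tau_prime - tau)

# A classical-style harmonic rule would need nu(n, n-2) = 2 * nu(n, n-1)
harmonic_gap = nu(n, n - 2) - 2 * nu(n, n - 1)

print(math.isclose(ritz_lhs, ritz_rhs))  # Ritz rule holds for any W(n)
print(harmonic_gap)                      # non-zero: no simple harmonics here
```

The Ritz rule holds identically because each frequency is a difference of term values, whereas the harmonic rule (11.10) fails unless the levels are equally spaced.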

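The second part is to work out the intensities and polarisations of the radiation emitted
in the quantum transitions. Classically, the motion of the electron in an orbit with principal
quantum number $n$ can be represented by a Fourier series expansion in harmonics of the
orbital frequency,

$$
x(n, t) = \sum_{\alpha} x(n, \alpha)\,\mathrm{e}^{\mathrm{i}\omega(n)\alpha t} , \tag{11.13}
$$

where the sum extends over all positive and negative integral values of $\alpha$, that is, $\sum_\alpha$ is
taken to mean $\sum_{\alpha=-\infty}^{+\infty}$. From the requirement that $x(n, t)$ must be real, it follows that
$x(n, -\alpha) = x^*(n, \alpha)$, where $x^*(n, \alpha)$ is the complex conjugate of $x(n, \alpha)$.

To find the value of $|x^2|$ in the classical case, we multiply together the Fourier series
for $x(n, t)$ and its complex conjugate, obtaining a double sum over the products of the
components of $x(n, \alpha)$. Since the Fourier series extends from $-\infty$ to $+\infty$ and the $\alpha$s are
integers, we obtain the same result if we write

$$
x(n, t) = \sum_{\alpha'} x(n, \alpha')\,\mathrm{e}^{\mathrm{i}\omega(n)\alpha' t} = \sum_{\alpha} x(n, \alpha' - \alpha)\,\mathrm{e}^{\mathrm{i}\omega(n)(\alpha' - \alpha)t} . \tag{11.14}
$$

The complex conjugate of the last expression is $\sum_\alpha x(n, \alpha - \alpha')\,\mathrm{e}^{\mathrm{i}\omega(n)(\alpha - \alpha')t}$ and so we can
write for $|x^2|$ the double sum over $\alpha'$ and $\alpha$,

$$
x^2(n, t) = \left[\sum_{\alpha'} x(n, \alpha')\,\mathrm{e}^{\mathrm{i}\omega(n)\alpha' t}\right] \times \left[\sum_{\alpha} x(n, \alpha - \alpha')\,\mathrm{e}^{\mathrm{i}\omega(n)(\alpha - \alpha')t}\right] . \tag{11.15}
$$

Multiplying out the components, the double sum can be rewritten

$$
x^2(n, t) = \sum_{\alpha}\left[\sum_{\alpha'} x(n, \alpha')\,x(n, \alpha - \alpha')\,\mathrm{e}^{\mathrm{i}\omega(n)\alpha' t}\,\mathrm{e}^{\mathrm{i}\omega(n)(\alpha - \alpha')t}\right] . \tag{11.16}
$$

Because of the rule (11.10), it immediately follows that the product of exponents in (11.16)
is $\mathrm{e}^{\mathrm{i}\omega(n)\alpha t}$ and so the double sum can be reduced to the form

$$
x^2(n, t) = \sum_{\alpha} x^2(n, \alpha)\,\mathrm{e}^{\mathrm{i}\omega(n)\alpha t} , \tag{11.17}
$$

where

$$
x^2(n, \alpha) = \sum_{\alpha'} x(n, \alpha')\,x(n, \alpha - \alpha') . \tag{11.18}
$$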


From Heisenberg’s perspective this is the key result since $x^2(n, t)$ can now be represented
by the sum over Fourier components, or ‘virtual oscillators’, with frequencies $\alpha\omega(n)$ which
are directly related to the intensities of radiation through expressions of the form (11.2), in
other words, to the intensities and polarisations of the ‘virtual oscillators’.
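
The classical product rule (11.18) is a discrete convolution of the Fourier components, and this can be checked directly. The sketch below assumes a small, made-up set of components obeying $x(n, -\alpha) = x^*(n, \alpha)$; the dictionary `coeffs`, the helper names and the numerical values are all illustrative assumptions.

```python
import cmath

def x_of_t(coeffs, omega, t):
    """Evaluate sum_alpha x(alpha) * exp(i * alpha * omega * t), as in (11.13)."""
    return sum(c * cmath.exp(1j * a * omega * t) for a, c in coeffs.items())

def square_coeffs(coeffs):
    """Fourier components of x^2 from the convolution rule (11.18)."""
    out = {}
    for a1, c1 in coeffs.items():        # a1 plays the role of alpha'
        for a2, c2 in coeffs.items():    # a2 plays the role of alpha - alpha'
            out[a1 + a2] = out.get(a1 + a2, 0) + c1 * c2
    return out

# Hypothetical components of a real signal: x(-alpha) = conj(x(alpha))
coeffs = {0: 0.3, 1: 0.5 + 0.2j, -1: 0.5 - 0.2j, 2: 0.1j, -2: -0.1j}
omega, t = 1.7, 0.83                     # arbitrary frequency and time

x2 = square_coeffs(coeffs)
direct = x_of_t(coeffs, omega, t) ** 2   # square the series directly
via_rule = x_of_t(x2, omega, t)          # resum using (11.17) and (11.18)
print(abs(direct - via_rule))
```

The two evaluations agree to rounding error because the exponents add according to (11.10); it is exactly this step that fails when the classical frequencies are replaced by two-index quantum frequencies.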

Heisenberg now seeks the quantum theoretical equivalent for $x(n, t)$. According to Bohr’s
correspondence principle, the quantum theoretical frequency $\nu(n, n-\tau)$ corresponds to
$\nu(n, \alpha)$ and so by analogy the quantity $x(n, n-\tau)$ corresponds to the Fourier component
$x(n, \alpha)$. This translation can be written symbolically

$$
\nu(n, \alpha) \leftrightarrow \nu(n, n-\tau) \quad\text{and so}\quad x(n, \alpha) \leftrightarrow x(n, n-\tau) . \tag{11.19}
$$

Notice that I am maintaining a clear distinction between the $\alpha$s, which represent harmonics
of the fundamental frequency $\omega_0$, and $\tau$ which labels different stationary states of the
electron in the atom. But this procedure immediately encounters the problem that there is
no unique way in which $x(t)$ can be represented by a sum over the quantities $x(n, n-\tau)$
which depend upon two variables. Nonetheless, Heisenberg presses on with the following
statement:


‘However, one may regard the ensemble of quantities $x(n, n-\tau) \exp[\mathrm{i}\omega(n, n-\tau)t]$ as a
representation of the quantity $x(t)$ and then attempt to answer the above question: how is
the quantity $x^2(t)$ to be represented?’

Consequently, Heisenberg seeks the quantum equivalent of (11.17) which can be written

$$
x^2(n, t) = \sum_{\tau} x^2(n, n-\tau) \exp[\mathrm{i}\omega(n, n-\tau)t] . \tag{11.20}
$$

The next step is to find the quantum theoretical equivalent of (11.18) for $x^2(n, n-\tau)$.
These will be the quantities which can be inserted into the classical formula for dipole
radiation and so determine the intensities of the transitions with angular frequency
$\omega(n, n-\tau)$.

Let us make the replacements (11.19) in (11.16). Then, the $\tau$ component of the outer
sum, the quantity in square brackets in (11.16), becomes

$$
\sum_{\tau'} x(n, n-\tau')\,x[n, n-(\tau-\tau')] \exp[\mathrm{i}\omega(n, n-\tau')t] \exp[\mathrm{i}\omega(n, n-(\tau-\tau'))t] . \tag{11.21}
$$
The problem with this replacement is that it does not result in terms with frequency
$\omega(n, n-\tau)$. The quantum frequency addition rule (11.11) derived from Ritz’s combination
principle is

$$
\omega(n, n-\tau') + \omega(n-\tau', n-\tau'-\tau) = \omega(n, n-\tau'-\tau) , \tag{11.22}
$$

and so the sum in the exponential term in (11.21)

$$
\omega(n, n-\tau') + \omega[n, n-(\tau-\tau')] \tag{11.23}
$$

is clearly unacceptable. Heisenberg was therefore driven to the ‘almost cogent’ conclusion
that the sum of the frequency factors had instead to satisfy the relation

$$
\omega(n, n-\tau) = \omega(n, n-\tau') + \omega(n-\tau', n-\tau) . \tag{11.24}
$$

Consequently, (11.21) had to be rewritten as follows

$$
\sum_{\tau'} x(n, n-\tau')\,x(n-\tau', n-\tau) \exp[\mathrm{i}\omega(n, n-\tau')t] \exp[\mathrm{i}\omega(n-\tau', n-\tau)t]
= \left(\sum_{\tau'} x(n, n-\tau')\,x(n-\tau', n-\tau)\right) \exp[\mathrm{i}\omega(n, n-\tau)t] . \tag{11.25}
$$

The quantum theoretical expression for $x^2(n, n-\tau)$ therefore has to be

$$
x^2(n, n-\tau) = \sum_{\tau'} x(n, n-\tau')\,x(n-\tau', n-\tau) , \tag{11.26}
$$

rather than (11.18). This is the new quantum theoretical multiplication rule which replaces
the classical expression (11.18). The quantities $x(n, n-\tau)$ can be thought of as transition
amplitudes since the squares of their moduli determine the transition probabilities between
the states $n$ and $n - \tau$.

Heisenberg immediately carried out the same procedure to determine $x^3(n)$ and also
wrote down the expression for the product of two quantities $x_n$ and $y_n$. If we write

$$
x(n, \alpha) \leftrightarrow x(n, n-\tau) \exp[\mathrm{i}\omega(n, n-\tau)t] , \qquad y(n, \alpha) \leftrightarrow y(n, n-\tau) \exp[\mathrm{i}\omega(n, n-\tau)t] , \tag{11.27}
$$

then, in general, $x_n y_n \neq y_n x_n$. This can be understood by writing out the product as a sum
over the components $\tau'$ according to the prescription (11.25). Then, in general,

$$
\sum_{\tau'} x(n, n-\tau')\,y(n-\tau', n-\tau) \neq \sum_{\tau'} y(n, n-\tau')\,x(n-\tau', n-\tau) . \tag{11.28}
$$

The new multiplication rule of quantum mechanics is non-commutative and this result
seriously worried Heisenberg. It turned out, however, to be a key innovation of Heisenberg’s
paper.
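
The non-commutativity expressed by (11.28) is easy to exhibit numerically. The sketch below stores two-index amplitudes in square arrays, so that the sum over intermediate states in (11.26) takes the form of what is now called matrix multiplication. The truncation to `N` states and the particular amplitudes (proportional to $\sqrt{n}$ between neighbouring states, with the second array differing by phase factors $\pm\mathrm{i}$) are assumptions chosen purely for illustration.

```python
import math

N = 5  # number of stationary states kept (hypothetical truncation)

def heisenberg_product(x, y):
    """(xy)(n, m) = sum_k x(n, k) * y(k, m): the rule (11.26), with k the intermediate state."""
    size = len(x)
    return [[sum(x[n][k] * y[k][m] for k in range(size)) for m in range(size)]
            for n in range(size)]

# Illustrative amplitudes linking neighbouring states only
x = [[0.0] * N for _ in range(N)]
y = [[0j] * N for _ in range(N)]
for n in range(1, N):
    amp = math.sqrt(n)
    x[n][n - 1] = x[n - 1][n] = amp   # real and symmetric
    y[n][n - 1] = 1j * amp            # same magnitudes, phases +i and -i
    y[n - 1][n] = -1j * amp

xy = heisenberg_product(x, y)
yx = heisenberg_product(y, x)
difference = [[xy[n][m] - yx[n][m] for m in range(N)] for n in range(N)]
print(difference[0][0], difference[1][1])  # non-zero: the order of the factors matters
```

For these amplitudes the diagonal elements of `xy - yx` are the same constant for every state except the last, where the truncation to a finite number of states spoils the result.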


11.4 The new dynamics 





Undaunted by the puzzle of non-commuting variables, Heisenberg proceeded to the second
section of his paper which deals with the mechanics of particles incorporating the new
kinematics. According to the procedures of the old quantum theory, the analysis begins
with the classical equation of motion, for example,

$$
\ddot{x} + f(x) = 0 , \tag{11.29}
$$

and this equation is solved classically. The phase integral $J = \oint p\,\mathrm{d}q = nh$ is then
introduced to satisfy the quantum conditions. In the one-dimensional case, $p = m\dot{x}$,
$\mathrm{d}q = \mathrm{d}x = \dot{x}\,\mathrm{d}t$ and so

$$
J = \oint m\dot{x}^2\,\mathrm{d}t = nh . \tag{11.30}
$$

Heisenberg now evaluates $J$ according to classical physics and then translates the result to
quantum physics using the new kinematics. First, $x$ is written classically as a Fourier series,
exactly as in (11.13):

$$
x = x(n, t) = \sum_{\alpha=-\infty}^{+\infty} x(n, \alpha)\,\mathrm{e}^{\mathrm{i}\omega(n)\alpha t} . \tag{11.31}
$$

It follows that

$$
m\dot{x} = m \sum_{\alpha=-\infty}^{+\infty} x(n, \alpha)\,\mathrm{i}\alpha\omega(n)\,\mathrm{e}^{\mathrm{i}\omega(n)\alpha t} . \tag{11.32}
$$

We now form $\dot{x}^2$ by multiplying $\dot{x}$ by its complex conjugate and integrating over a period of
the motion $t = 2\pi/\omega(n)$. All the products of the series except those involving products such
as $x(n, \alpha)x(n, -\alpha)$ are zero on integrating over one period. We also recall that $x(n, -\alpha)$
is the complex conjugate of $x(n, \alpha)$ and so the integral over one period of the
motion is

$$
J = \oint m\dot{x}^2\,\mathrm{d}t = 2\pi m \sum_{\alpha=-\infty}^{+\infty} x(n, \alpha)\,x(n, -\alpha)\,\alpha^2\omega(n) = 2\pi m \sum_{\alpha=-\infty}^{+\infty} |x(n, \alpha)|^2\,\alpha^2\omega(n) . \tag{11.33}
$$
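
The result (11.33) can be checked by integrating $m\dot{x}^2$ numerically over one period for a trial motion. This is a minimal sketch in arbitrary units; the two Fourier components, the mass and the frequency are made-up values used only for the check.

```python
import cmath
import math

m, omega = 1.0, 2.0  # arbitrary units (assumed values)

# Hypothetical Fourier components; x(-alpha) = conj(x(alpha)) keeps x(t) real
coeffs = {1: 0.4 + 0.1j, 2: 0.05 - 0.02j}
coeffs.update({-a: c.conjugate() for a, c in list(coeffs.items())})

def xdot(t):
    """Velocity obtained by differentiating the series (11.31) term by term."""
    total = sum(1j * a * omega * c * cmath.exp(1j * a * omega * t)
                for a, c in coeffs.items())
    return total.real  # the imaginary parts cancel for a real motion

period = 2 * math.pi / omega
steps = 2000
dt = period / steps

# Left-hand side of (11.33): the phase integral evaluated as a Riemann sum
J_numeric = sum(m * xdot(k * dt) ** 2 * dt for k in range(steps))

# Right-hand side of (11.33): the sum over Fourier components
J_formula = 2 * math.pi * m * sum(abs(c) ** 2 * a ** 2 * omega
                                  for a, c in coeffs.items())
print(J_numeric, J_formula)
```

The cross terms between different harmonics average to zero over a full period, which is why only the products $x(n, \alpha)x(n, -\alpha)$ survive.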
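Now the quantum condition $J = nh$ is applied, but Heisenberg immediately takes
the derivative of $nh$ with respect to $n$, preparing the way for the application of Born’s
correspondence rule,

$$
\frac{\mathrm{d}}{\mathrm{d}n}(nh) = \frac{\mathrm{d}}{\mathrm{d}n}\oint m\dot{x}^2\,\mathrm{d}t , \tag{11.34}
$$

$$
h = 2\pi m \sum_{\alpha=-\infty}^{+\infty} \alpha\frac{\mathrm{d}}{\mathrm{d}n}\left[\alpha\omega(n)|x(n, \alpha)|^2\right] . \tag{11.35}
$$

Born’s correspondence rule (10.27) tells us that differentials are to be replaced by differences
according to the recipe

$$
\alpha\frac{\mathrm{d}\Phi(n)}{\mathrm{d}n} \leftrightarrow \Phi(n) - \Phi(n-\tau) . \tag{11.36}
$$

This had been generalised by Kramers and Heisenberg (1925) in their theory of dispersion
as follows:

$$
\alpha\frac{\mathrm{d}\Phi(n, \alpha)}{\mathrm{d}n} \leftrightarrow \Phi(n+\tau, n) - \Phi(n, n-\tau) . \tag{11.37}
$$

In making this translation, the angular frequency of the $\alpha$ harmonic and the amplitude
$x(n, \alpha)$ are translated as

$$
\Phi(n+\tau, n): \quad \alpha\omega(n) \leftrightarrow \omega(n, n+\tau) ; \quad x(n, \alpha) \leftrightarrow x(n, n+\tau) , \tag{11.38}
$$

$$
\Phi(n, n-\tau): \quad \alpha\omega(n) \leftrightarrow \omega(n, n-\tau) ; \quad x(n, \alpha) \leftrightarrow x(n, n-\tau) . \tag{11.39}
$$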


Consequently, (11.35) becomes

$$
h = 4\pi m \sum_{\tau=0}^{+\infty} \left[|x(n, n+\tau)|^2\,\omega(n, n+\tau) - |x(n, n-\tau)|^2\,\omega(n, n-\tau)\right] , \tag{11.40}
$$

where we have used the fact that $x$ is real to convert the sum from $-\infty$ to $+\infty$ to twice
that from $0$ to $+\infty$ since $|x(n, \tau)|^2 = |x(n, -\tau)|^2$. Furthermore, the values of the $x$s can be
determined absolutely since there is a requirement that, in the ground state $n_0$, no emission
processes are possible to a lower energy state, in other words, $x(n_0, n_0 - \tau) = 0$ for $\tau > 0$.
These are remarkable calculations, as Heisenberg realised: 


‘Equations [(11.29)] and [(11.40)], if soluble, contain a complete determination not only
of frequencies and energy values, but also of quantum theoretical transition probabilities. 
However, at present the actual mathematical solution can be obtained only in the simplest 
cases.’ 


The concept is that the equation of motion is solved in terms of Fourier components and
then these values entered into the quantum theoretical relation (11.40) which incorporates
the quantisation conditions. But much more is obtained since the quantities $x(n_0, n_0 - \tau)$
are directly related to transition probabilities. Heisenberg noted in a footnote that (11.40)
can be simply reduced to the Kuhn–Thomas relation by relating the values of $x(n_0, n_0 - \tau)$
to Einstein’s $A$ coefficient for spontaneous emission and to the values of the oscillator
strengths $f_i$. This can be demonstrated as follows.

We have already derived the relation between the Einstein $A$ coefficient and $f_i$ in (10.16),
namely, with $\omega_0 = 2\pi\nu_0$,

$$
f_i = \frac{\varepsilon_0 m_{\rm e} c^3}{2\pi\nu_0^2 e^2} A_m^n . \tag{11.41}
$$

We also showed in (11.4) how $A_m^n$ can be related to $|x_0|^2$,

$$
A_m^n = \frac{\omega_0^3 e^2 |x_0|^2}{2\varepsilon_0 c^3 h} . \tag{11.42}
$$

Inserting this value for $A_m^n$ into (11.41), we find the relation

$$
|x_0|^2\omega_0 = \frac{h}{\pi m_{\rm e}} f_i . \tag{11.43}
$$

Now, we make use of the correspondence rule to write

$$
\left[|x_0|^2\omega_0\right]_{\rm a} \leftrightarrow |x(n, n+\tau)|^2\,\omega(n, n+\tau) = \frac{h}{\pi m_{\rm e}} f_{\rm a} , \tag{11.44}
$$

$$
\left[|x_0|^2\omega_0\right]_{\rm e} \leftrightarrow |x(n, n-\tau)|^2\,\omega(n, n-\tau) = \frac{h}{\pi m_{\rm e}} f_{\rm e} . \tag{11.45}
$$

Inserting these relations for absorption from, and induced emission into, the quantum state $n$
into (11.40), Heisenberg demonstrated that the new formalism resulted in the Kuhn–Thomas
formula (10.34).

This simple reduction illustrates how the new formalism operates entirely in terms of 
observables, namely, the energies of the stationary states, the frequencies of emission and 
the transition probabilities between states. 


11.5 The nonlinear oscillator 


Heisenberg realised that his new formalism could only be applied to the simplest cases. He 
had already attempted to quantise the orbits of an electron in the hydrogen atom, but even 
this case proved to be prohibitively complicated. Instead, in the third section of his paper, 
he treats first the case of the nonlinear oscillator and then a rotator in which an electron 
orbits the nucleus at constant distance a. This is the most difficult section of the paper 
since Heisenberg gives no indication of how he obtained the results quoted. The paper 
by Aitchison and his colleagues (2004) makes a convincing case for the route Heisenberg 
probably followed to obtain these results and we will follow that exposition. The working 
out requires a great deal of algebraic manipulation, but, as we have already noted, Born 
commented that Heisenberg’s 


‘...incredible quickness and acuteness of apprehension has always enabled him to do a
colossal amount of work without much effort.’ (van der Waerden, 1967)


On first reading, the reader may prefer to jump to the results (11.93)-(11.97); although
not easy reading, the following calculations have the virtue of demonstrating Heisenberg’s
remarkable technical fluency.

The classical equation of motion of a nonlinear oscillator can be written

$$
\ddot{x} + \omega_0^2 x + \lambda x^2 = 0 , \tag{11.46}
$$

where $\lambda x^2$ is the nonlinear term and $\lambda$ is a small quantity. The advantage of treating this
oscillator over a purely harmonic oscillator is that the Fourier series to represent its motion
automatically contains a complete set of harmonics of the fundamental frequency $\omega_0$. The
form of Fourier series solution chosen by Heisenberg is as follows:

$$
x = \lambda a_0 + a_1 \cos\omega t + \lambda a_2 \cos 2\omega t + \lambda^2 a_3 \cos 3\omega t + \cdots + \lambda^{\tau-1} a_\tau \cos\tau\omega t + \cdots , \tag{11.47}
$$

where the coefficients $a_0, a_1, a_2, \ldots, a_\tau, \ldots$ and $\omega$ are also expanded as power series in $\lambda$:

$$
a_0 = a_0^{(0)} + \lambda a_0^{(1)} + \lambda^2 a_0^{(2)} + \cdots , \tag{11.48}
$$

$$
a_1 = a_1^{(0)} + \lambda a_1^{(1)} + \lambda^2 a_1^{(2)} + \cdots , \tag{11.49}
$$

$$
\omega = \omega_0 + \lambda\omega^{(1)} + \lambda^2\omega^{(2)} + \cdots . \tag{11.50}
$$

The nonlinear equation (11.46) cannot be solved in general and so a perturbation solution is
sought in terms of power series in the small quantity $\lambda$. Inserting (11.47), (11.48) and (11.49)
into (11.46), the recursion formulae for values of $a_0, a_1, a_2, \ldots$ are found by equating the
constant terms and those separately in $\cos\omega t$, $\cos 2\omega t$, $\cos 3\omega t$, $\ldots$ to zero. Ignoring
higher-order terms in $\lambda$, it is straightforward to repeat Heisenberg’s calculation and find the
following recursion formulae for the first few terms of the series for the classical nonlinear
oscillator:

$$
\begin{aligned}
&[\text{constant}] & \lambda\left\{\omega_0^2\,a_0(n) + \tfrac{1}{2}a_1^2(n)\right\} &= 0 ; \\
&[\cos\omega t] & \left\{-\omega^2 + \omega_0^2\right\} &= 0 ; \\
&[\cos 2\omega t] & \lambda\left\{\left[-(2\omega)^2 + \omega_0^2\right]a_2(n) + \tfrac{1}{2}a_1^2(n)\right\} &= 0 ; \\
&[\cos 3\omega t] & \lambda^2\left\{\left[-(3\omega)^2 + \omega_0^2\right]a_3(n) + a_1(n)a_2(n)\right\} &= 0 .
\end{aligned} \tag{11.51}
$$


Heisenberg now makes the transition from classical to quantum mechanics. When $x$ is
replaced by the $x(n, n-\tau) \exp[\mathrm{i}\omega(n, n-\tau)t]$, the first two terms of (11.46) become

$$
\left[-\omega^2(n, n-\tau) + \omega_0^2\right] x(n, n-\tau) \exp[\mathrm{i}\omega(n, n-\tau)t] . \tag{11.52}
$$

The third nonlinear term in $x^2$ has to be replaced by the quantum-theoretical multiplication
rule (11.26) and so the term becomes

$$
\lambda \sum_{\tau'} x(n, n-\tau')\,x(n-\tau', n-\tau) \exp[\mathrm{i}\omega(n, n-\tau)t] . \tag{11.53}
$$

Substituting (11.52) and (11.53) into (11.46), we obtain a recursion relation for the transition
amplitude $x(n, n-\tau)$,

$$
\left[-\omega^2(n, n-\tau) + \omega_0^2\right] x(n, n-\tau) + \lambda \sum_{\tau'} x(n, n-\tau')\,x(n-\tau', n-\tau) = 0 . \tag{11.54}
$$

Once again no general solution can be found and Heisenberg carries out the same
perturbation analysis as for the classical case. He proposes that the translation should take the
form:

$$
\begin{aligned}
\lambda a_0(n) &\leftrightarrow \lambda a(n, n) ; \\
a_1(n) &\leftrightarrow a(n, n-1) \cos\omega(n, n-1)t ; \\
\lambda a_2(n) &\leftrightarrow \lambda a(n, n-2) \cos\omega(n, n-2)t ; \\
\lambda^2 a_3(n) &\leftrightarrow \lambda^2 a(n, n-3) \cos\omega(n, n-3)t ; \\
\lambda^{\alpha-1} a_\alpha(n) &\leftrightarrow \lambda^{\alpha-1} a(n, n-\alpha) \cos\omega(n, n-\alpha)t .
\end{aligned} \tag{11.55}
$$


In his paper, Heisenberg simply writes out the result of going through exactly the same
analysis as in the classical case. Let us write down the appropriate translation to quantum
language of (11.47)-(11.50). (11.47) becomes

$$
\begin{aligned}
x = {} & \lambda a(n, n) + a(n, n-1) \cos\omega(n, n-1)t + \lambda a(n, n-2) \cos\omega(n, n-2)t \\
& + \lambda^2 a(n, n-3) \cos\omega(n, n-3)t + \cdots + \lambda^{\alpha-1} a(n, n-\alpha) \cos\omega(n, n-\alpha)t + \cdots
\end{aligned} \tag{11.56}
$$

Once again, as in (11.48)-(11.50), the $a$ coefficients and $\omega$ are expanded in a power series
in $\lambda$ so that the equivalent functions are:

$$
a(n, n) = a^{(0)}(n, n) + \lambda a^{(1)}(n, n) + \lambda^2 a^{(2)}(n, n) + \cdots , \tag{11.57}
$$

$$
a(n, n-1) = a^{(0)}(n, n-1) + \lambda a^{(1)}(n, n-1) + \lambda^2 a^{(2)}(n, n-1) + \cdots , \tag{11.58}
$$

$$
\omega(n, n-\alpha) = \omega^{(0)}(n, n-\alpha) + \lambda\omega^{(1)}(n, n-\alpha) + \lambda^2\omega^{(2)}(n, n-\alpha) + \cdots . \tag{11.59}
$$

Now, the same type of manipulation involved in deriving (11.51) is applied to
(11.46) and (11.56) to (11.59). In other words, recursion formulae for values of $a(n, n)$,
$a(n, n-1)$, $a(n, n-2)$, $\ldots$ are found by equating the constant terms and those separately
in $\cos\omega(n, n-1)t$, $\cos\omega(n, n-2)t$, $\cos\omega(n, n-3)t$, $\ldots$ to zero. More details of the
analysis are included in the paper by Aitchison et al. (2004) in which the recurrence relations for
the transition amplitudes up to second order in $\lambda$ are given. Exactly as before, the recurrence
relations to lowest order in $\lambda$, which Heisenberg quotes without proof, are as follows:

$$
[\text{constant}] \quad \lambda\left\{\omega_0^2\,a(n, n) + \tfrac{1}{4}\left[a^2(n+1, n) + a^2(n, n-1)\right]\right\} = 0 ; \tag{11.60}
$$

$$
[\cos\omega(n, n-1)t] \quad -\omega^2(n, n-1) + \omega_0^2 = 0 ; \tag{11.61}
$$

$$
[\cos\omega(n, n-2)t] \quad \lambda\left\{\left[-\omega^2(n, n-2) + \omega_0^2\right]a(n, n-2) + \tfrac{1}{2}a(n, n-1)\,a(n-1, n-2)\right\} = 0 ; \tag{11.62}
$$

$$
[\cos\omega(n, n-3)t] \quad \lambda^2\left\{\left[-\omega^2(n, n-3) + \omega_0^2\right]a(n, n-3) + \tfrac{1}{2}\left[a(n, n-1)\,a(n-1, n-3)\right] + \tfrac{1}{2}\left[a(n, n-2)\,a(n-2, n-3)\right]\right\} = 0 . \tag{11.63}
$$


Next, we recall that the object of the exercise is to determine the quantity $x(n, n-\tau)$
which is given by (11.26). We therefore need to relate the quantities $a(n, n-\tau)$ derived
above to the $x(n, n-\tau)$. In the classical case, $x_\alpha(n) = x^*_{-\alpha}(n)$ since $x(t)$ is real. The
quantum theoretical analogue of this relation is

$$
x(n, n-\tau) = x^*(n-\tau, n) . \tag{11.64}
$$

This can be appreciated from the fact that the product of the associated exponentials
$\exp[\mathrm{i}\omega(n, n-\tau)t] \times \exp[\mathrm{i}\omega(n-\tau, n)t] = 1$ according to the quantum rule for the addition
of frequencies. In principle, $x(n, n-\tau)$ could be a complex number, but Heisenberg
implicitly assumed, by using the simple cosine expansion, that the amplitudes would be real.
Consequently,

$$
x(n, n-\tau) = x(n-\tau, n) . \tag{11.65}
$$

We can now compare this with the series expansion (11.56) for $x(t)$, a typical term of
which can be written as follows:

$$
\lambda^{\tau-1} a(n, n-\tau) \cos\omega(n, n-\tau)t
= \frac{\lambda^{\tau-1}}{2} a(n, n-\tau) \left\{\exp[\mathrm{i}\omega(n, n-\tau)t] + \exp[-\mathrm{i}\omega(n, n-\tau)t]\right\} . \tag{11.66}
$$

Since Bohr’s frequency condition (11.9) requires $\omega(n, n-\tau) = -\omega(n-\tau, n)$ and
$x(n, n-\tau) = x(n-\tau, n)$, this expression can be rewritten

$$
\frac{\lambda^{\tau-1}}{2} a(n, n-\tau) \exp[\mathrm{i}\omega(n, n-\tau)t] + \frac{\lambda^{\tau-1}}{2} a(n-\tau, n) \exp[\mathrm{i}\omega(n-\tau, n)t] . \tag{11.67}
$$

It is not surprising that the cosine expression corresponds to the sum of two exponentials
in the series (11.20). Equating the terms in $\exp[\mathrm{i}\omega(n, n-\tau)t]$, it follows that

$$
x(n, n-\tau) = \frac{\lambda^{\tau-1}}{2} a(n, n-\tau) . \tag{11.68}
$$

Strictly speaking this equation only applies for positive values of $\tau$. In general, the result is

$$
x(n, n-\tau) = \frac{\lambda^{|\tau|-1}}{2} a(n, n-\tau) , \qquad \tau \neq 0 . \tag{11.69}
$$


The solutions of the equations of motion have been found in terms of $a(n, n-\tau)$ rather
than $x(n, n-\tau)$ and so it is convenient to use (11.69) to translate (11.40) into this notation.
Then,

$$
h = \pi m \sum_{\tau=0}^{+\infty} \left[|a(n, n+\tau)|^2\,\omega(n, n+\tau) - |a(n, n-\tau)|^2\,\omega(n, n-\tau)\right] . \tag{11.70}
$$

We can now find the solutions for the transition amplitudes, the energies of the stationary
states and the transition frequencies. We obtain many of the key results from the lowest
order solutions in which all the $\lambda$ terms are omitted. In this approximation, all the $a$s and
$\omega$s acquire the superscript (0), $a^{(0)}$ and $\omega^{(0)}$. In this limit, (11.61) becomes

$$
\omega^{(0)}(n, n-1) = \omega_0 , \tag{11.71}
$$

for all $n$. This makes sense since, if the perturbation term in $\lambda$ is zero, the nonlinear
oscillator becomes a linear oscillator with angular frequency $\omega_0$. Now we use (11.71) to
replace $\omega(n, n+1)$ and $\omega(n, n-1)$ in (11.70), retaining only the $\tau = 1$ terms, which are the
only ones of zero order in $\lambda$. Then,

$$
\frac{h}{\pi m\omega_0} = \left[a^{(0)}(n, n+1)\right]^2 - \left[a^{(0)}(n, n-1)\right]^2 . \tag{11.72}
$$

By inspection, we see that the solution of this difference equation is

$$
\left[a^{(0)}(n, n-1)\right]^2 = \frac{h}{\pi m\omega_0}\,(n + \text{constant}) . \tag{11.73}
$$

The value of the constant is found from Heisenberg’s argument that there should be no
transitions from the ground state to lower states, that is,

$$
\left[a^{(0)}(0, -1)\right]^2 = 0 . \tag{11.74}
$$

Consequently, the constant is zero. The solution for $a^{(0)}(n, n-1)$ can therefore be written

$$
a^{(0)}(n, n-1) = \beta\sqrt{n} \quad\text{where}\quad \beta = (h/\pi m\omega_0)^{1/2} . \tag{11.75}
$$
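
A quick numerical check confirms that the solution (11.75) satisfies the difference equation (11.72) for every $n$. The sketch works in arbitrary units with $h = m = \omega_0 = 1$, an assumption made purely for convenience, and uses the symmetry (11.65) to take the amplitude linking $n$ and $n+1$ as $\beta\sqrt{n+1}$.

```python
import math

h, m, omega0 = 1.0, 1.0, 1.0   # arbitrary units (assumed values)
beta = math.sqrt(h / (math.pi * m * omega0))

def a0(n):
    """Lowest-order amplitude a^(0)(n, n-1) = beta * sqrt(n), as in (11.75)."""
    return beta * math.sqrt(n)

# Right-hand side of (11.72) for n = 0, 1, ..., 5, where the amplitude
# between states n and n+1 is a0(n + 1) by the symmetry (11.65)
differences = [a0(n + 1) ** 2 - a0(n) ** 2 for n in range(6)]
target = h / (math.pi * m * omega0)

print(differences, target)
print(a0(0))  # no transition downwards from the ground state, as in (11.74)
```

Every difference equals $h/\pi m\omega_0$, so the squared amplitudes increase linearly with $n$, and the ground-state condition fixes the additive constant to zero.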
We now repeat the procedure for (11.60):

$$
a^{(0)}(n, n) = -\frac{1}{4\omega_0^2}\left\{\left[a^{(0)}(n+1, n)\right]^2 + \left[a^{(0)}(n, n-1)\right]^2\right\} , \tag{11.76}
$$

and so using (11.75),

$$
a^{(0)}(n, n) = -\frac{\beta^2}{4\omega_0^2}\,(2n + 1) . \tag{11.77}
$$

Next, we repeat the procedure for (11.62). Then,

$$
\left\{-\left[\omega^{(0)}(n, n-2)\right]^2 + \omega_0^2\right\} a^{(0)}(n, n-2) + \tfrac{1}{2}\,a^{(0)}(n, n-1)\,a^{(0)}(n-1, n-2) = 0 . \tag{11.78}
$$

The frequency $\omega^{(0)}(n, n-2)$ must obey the frequency composition rule

$$
\omega^{(0)}(n, n-2) = \omega^{(0)}(n, n-1) + \omega^{(0)}(n-1, n-2) , \tag{11.79}
$$

and since $\omega^{(0)}(n, n-1) = \omega_0$ for all $n$, $\omega^{(0)}(n, n-2) = 2\omega_0$. In fact, it is obvious from
the frequency combination rule that repeated application gives the result

$$
\omega^{(0)}(n, n-\tau) = \tau\omega_0 . \tag{11.80}
$$

This result again makes a lot of sense. The harmonics of the nonlinear oscillator are simply
integral multiples of the fundamental frequency $\omega_0$ in the lowest approximation.


It is straightforward to show, by the same reasoning which led to (11.75) and (11.77),
that

$$a^{(0)}(n, n-3) = \frac{\beta^3}{48\omega_0^4}\sqrt{n(n-1)(n-2)} , \tag{11.81}$$

and that in general, in this lowest order of approximation,

$$a^{(0)}(n, n-\tau) = A_\tau\,\frac{\beta^\tau}{\omega_0^{2(\tau-1)}}\sqrt{\frac{n!}{(n-\tau)!}} , \tag{11.82}$$

where $A_\tau$ is a numerical factor which depends upon $\tau$.
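
The lowest-order amplitudes quoted above can be checked numerically. What follows is a minimal sketch, assuming units in which $\beta = \omega_0 = 1$; the function names are illustrative, and the third-harmonic recursion used in `a3_from_recursion` is the one usually quoted from Heisenberg's paper, which is not reproduced in this section and is therefore an assumption here.

```python
import math

BETA = 1.0    # beta = (h / pi m omega_0)^(1/2), set to 1 (assumed units)
OMEGA0 = 1.0  # fundamental angular frequency, set to 1 (assumed units)


def a1(n):
    """Lowest-order amplitude a(n, n-1) = beta * sqrt(n), equation (11.75)."""
    return BETA * math.sqrt(n)


def a0(n):
    """Constant term a(n, n) evaluated from (11.76)."""
    return -(a1(n + 1) ** 2 + a1(n) ** 2) / (4 * OMEGA0 ** 2)


def a2(n):
    """Second harmonic a(n, n-2), solving (11.78) with omega(n, n-2) = 2 omega_0."""
    return 0.5 * a1(n) * a1(n - 1) / (4 * OMEGA0 ** 2 - OMEGA0 ** 2)


def a3_from_recursion(n):
    """Third harmonic from the assumed recursion
    (-9 w0^2 + w0^2) a3 + [a1(n) a2(n-1) + a2(n) a1(n-2)] / 2 = 0."""
    return 0.5 * (a1(n) * a2(n - 1) + a2(n) * a1(n - 2)) / (8 * OMEGA0 ** 2)


def a3_closed_form(n):
    """Closed form (11.81)."""
    return BETA ** 3 / (48 * OMEGA0 ** 4) * math.sqrt(n * (n - 1) * (n - 2))


for n in range(3, 8):
    # difference equation (11.72): the step in a^2 equals beta^2 for every n
    assert math.isclose(a1(n + 1) ** 2 - a1(n) ** 2, BETA ** 2)
    # closed form (11.77) for the constant term
    assert math.isclose(a0(n), -BETA ** 2 * (2 * n + 1) / (4 * OMEGA0 ** 2))
    # recursion and closed form (11.81) agree
    assert math.isclose(a3_from_recursion(n), a3_closed_form(n))
```

The same pattern extends to higher harmonics, each of which needs only the amplitudes already found at lower order.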


The final task is to work out the energies of the stationary states of the oscillator.
Heisenberg begins with the classical expression for the energy of a nonlinear oscillator.
Following the usual procedure, we multiply (11.46) by $m\dot{x}$ and integrate with respect to
time with the result

$$W = \tfrac{1}{2} m\dot{x}^2 + \tfrac{1}{2} m\omega_0^2 x^2 + \tfrac{1}{3}\lambda m x^3 . \tag{11.83}$$
He then quotes the answers according to classical and quantum physics without explaining 
how they are obtained. Aitchison and his colleagues conjecture entirely plausibly that he 
reasoned as follows. Heisenberg now had a formalism for taking products of transition
amplitudes, (11.25) and (11.26), which are repeated here for convenience:

$$\begin{aligned}
&\sum_{\tau'} x(n, n-\tau')\,x(n-\tau', n-\tau)\,\exp[i\omega(n, n-\tau')t]\,\exp[i\omega(n-\tau', n-\tau)t]\\
&\qquad = \sum_{\tau'} x(n, n-\tau')\,x(n-\tau', n-\tau)\,\exp[i\omega(n, n-\tau)t] .
\end{aligned} \tag{11.84}$$

Therefore, $x^2$ is represented by

$$\sum_{\tau'} x(n, n-\tau')\,x(n-\tau', n-\tau)\,\exp[i\omega(n, n-\tau)t] . \tag{11.85}$$


t 


Correspondingly, $\dot{x}^2$ can be replaced by the product

$$\begin{aligned}
&\sum_{\tau'} i\omega(n, n-\tau')\,x(n, n-\tau')\;i\omega(n-\tau', n-\tau)\,x(n-\tau', n-\tau)\\
&\qquad\times \exp[i\omega(n, n-\tau')t]\,\exp[i\omega(n-\tau', n-\tau)t]\\
&= \sum_{\tau'} \omega(n, n-\tau')\,\omega(n-\tau, n-\tau')\,x(n, n-\tau')\,x(n-\tau', n-\tau)\,\exp[i\omega(n, n-\tau)t] ,
\end{aligned} \tag{11.86}$$

where we have used the relation $\omega(n, m) = -\omega(m, n)$, according to Bohr’s frequency relation.
Thus, the expression for the energy of the nonlinear oscillator is of the form

$$W = \sum_{\tau} W(n, n-\tau)\,\exp[i\omega(n, n-\tau)t] . \tag{11.87}$$


Heisenberg fully appreciated the fact that energy will only be conserved if $W$ is
time-independent, meaning that the terms in which $\tau \neq 0$ must be zero,

$$W(n, n-\tau) = 0 , \quad\text{if}\quad \tau \neq 0 . \tag{11.88}$$


This provided a key test of the validity of the new quantum formalism. 

Let us first evaluate $W$ given the requirement (11.88). We are interested in the lowest order
solutions and so we can neglect terms in $\lambda$ and its higher powers. Therefore, we can neglect
the term in $\tfrac{1}{3}\lambda m x^3$ in the expression for the total energy. Likewise, in the expansion (11.56),
the only terms which do not depend upon $\lambda$ are those in $a(n, n-1)\cos\omega(n, n-1)t$, in
other words, transitions in which $n$ changes only by one. Therefore, the only terms which
survive in (11.87) are those in $W(n, n)$, $W(n, n-2)$ and $W(n, n+2)$. In fact, Aitchison
and his colleagues show that $W(n, n-2)$ and $W(n, n+2)$ both sum to zero and so the
only surviving terms are those in $W(n, n)$. Therefore, we need only evaluate $W(n, n)$ for
which $\tau = 0$ and so

$$\begin{aligned}
W &= \tfrac{1}{2} m\dot{x}^2 + \tfrac{1}{2} m\omega_0^2 x^2\\
&= \tfrac{1}{2} m \sum_{\tau'} \omega(n, n-\tau')\,\omega(n, n-\tau')\,x(n, n-\tau')\,x(n-\tau', n)\\
&\quad + \tfrac{1}{2} m\omega_0^2 \sum_{\tau'} x(n, n-\tau')\,x(n-\tau', n) .
\end{aligned} \tag{11.89}$$





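To find the total energy, we sum over the only surviving terms in $\tau'$, $\tau' = \pm 1$. The $\omega$s can
now be awarded superscripts (0) since all terms in $\lambda$ and its higher orders are omitted in the
lowest order. Therefore,

$$\begin{aligned}
W &= \tfrac{1}{2} m \big\{[\omega^{(0)}(n, n-1)\,\omega^{(0)}(n, n-1)\,x(n, n-1)\,x(n-1, n)]\\
&\qquad + [\omega^{(0)}(n, n+1)\,\omega^{(0)}(n, n+1)\,x(n, n+1)\,x(n+1, n)]\big\}\\
&\quad + \tfrac{1}{2} m\omega_0^2 \big\{x(n, n-1)\,x(n-1, n) + x(n, n+1)\,x(n+1, n)\big\} .
\end{aligned} \tag{11.90}$$

From (11.65), $x(n, n+1) = x(n+1, n)$ and we can now replace the $x$s by the $a$s
according to (11.69). Also, because the transitions are all equally spaced in frequency,
$\omega^{(0)}(n, n-1) = \omega^{(0)}(n, n+1) = \omega_0$. Therefore,

$$\begin{aligned}
W &= \tfrac{1}{2} m\omega_0^2 \big\{\tfrac{1}{4}[a^{(0)}(n, n-1)]^2 + \tfrac{1}{4}[a^{(0)}(n, n+1)]^2\big\}\\
&\quad + \tfrac{1}{2} m\omega_0^2 \big\{\tfrac{1}{4}[a^{(0)}(n, n-1)]^2 + \tfrac{1}{4}[a^{(0)}(n, n+1)]^2\big\} .
\end{aligned} \tag{11.91}$$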

Finally, we insert the solutions for the $a$s from (11.75) which become

$$a^{(0)}(n, n-1) = \beta\sqrt{n} \quad\text{and}\quad a^{(0)}(n+1, n) = \beta\sqrt{n+1} \quad\text{where}\quad \beta = (h/\pi m\omega_0)^{1/2} . \tag{11.92}$$

The final result is

$$W = \frac{h\omega_0}{2\pi}\left(n + \tfrac{1}{2}\right) . \tag{11.93}$$

Heisenberg immediately notes that this expression for the energy of the harmonic oscillator
differs from the ‘classical’ result $W = nh\omega_0/2\pi$ because of the inclusion of what is now
referred to as the ‘zero point energy’ $\tfrac{1}{2}h\omega_0/2\pi$.
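
The zeroth-order energy calculation can be reproduced with finite arrays of amplitudes. The sketch below is an illustration in modern matrix language rather than Heisenberg's own procedure. It assumes units with $h = 2\pi$ and $m = \omega_0 = 1$, truncates the array at an arbitrary number of states `N` (so that the last diagonal entry is spoiled by the truncation), and uses variable names chosen purely for illustration.

```python
import numpy as np

N = 8                                   # number of states kept (assumed truncation)
h, m, omega0 = 2 * np.pi, 1.0, 1.0      # assumed units
beta = np.sqrt(h / (np.pi * m * omega0))

n = np.arange(N)
# omega(n, k) = (n - k) * omega0: equally spaced levels with omega(n, k) = -omega(k, n)
omega = omega0 * (n[:, None] - n[None, :])

# x(n, n-1) = x(n-1, n) = a(n, n-1) / 2 = (beta / 2) sqrt(n); all other amplitudes vanish
x = np.zeros((N, N))
for k in range(1, N):
    x[k, k - 1] = x[k - 1, k] = 0.5 * beta * np.sqrt(k)

# each amplitude carries the factor exp[i omega(n, k) t]; derivative evaluated at t = 0
xdot = 1j * omega * x

# W = (1/2) m xdot^2 + (1/2) m omega0^2 x^2, with the product rule taken as a matrix product
W = 0.5 * m * (xdot @ xdot) + 0.5 * m * omega0 ** 2 * (x @ x)

off_diagonal = W - np.diag(np.diag(W))            # the terms W(n, n - tau) with tau != 0
levels = np.real(np.diag(W))[: N - 1]             # drop the truncated last state
expected = h * omega0 / (2 * np.pi) * (n[: N - 1] + 0.5)

print(np.allclose(off_diagonal, 0.0), np.allclose(levels, expected))
```

In this sketch the terms $W(n, n \pm 2)$ cancel between the kinetic and potential contributions, which is the content of (11.88) at this order, and the diagonal reproduces (11.93).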
The above analysis is only to zero-order in the small parameter $\lambda$. Heisenberg does not
proceed to evaluate the higher order corrections of the nonlinear oscillator (11.46), but
rather treats the simpler example

$$\ddot{x} + \omega_0^2 x + \lambda x^3 = 0 , \tag{11.94}$$

the reason being that solutions only contain ‘odd’ terms:

$$x = a_1\cos\omega t + \lambda a_3\cos 3\omega t + \lambda^2 a_5\cos 5\omega t + \cdots \tag{11.95}$$

Preserving terms up to $\lambda^2$, he finds the result

$$W = \frac{h\omega_0}{2\pi}\left(n + \tfrac{1}{2}\right) + \lambda\,\frac{3\left(n^2 + n + \tfrac{1}{2}\right)h^2}{32\pi^2\omega_0^2 m} - \lambda^2\,\frac{h^3}{512\pi^3\omega_0^5 m^2}\left(17n^3 + \tfrac{51}{2}n^2 + \tfrac{59}{2}n + \tfrac{21}{2}\right) . \tag{11.96}$$

He also makes the important parenthetical remark, 


‘(I could not prove in general that all periodic terms actually vanish, but this was the case 
for all terms evaluated)’ 


but gives no details about which terms he tested. As discussed in the context of (11.87) and
(11.88), a key requirement of the theory is to ensure that energy is conserved. Aitchison
and his colleagues (2004) demonstrate in Appendix B of their paper how these calculations
can be performed using Heisenberg’s approach and demonstrate that up to order $\lambda^2$ the terms
$W(n, n-\alpha)$ are indeed zero, if $\alpha \neq 0$. They also carry out the equivalent analysis which
led to (11.96) for the case analysed by Heisenberg with the nonlinear term $\lambda x^2$ and find the
result to order $\lambda^2$,

$$W = \frac{h\omega_0}{2\pi}\left(n + \tfrac{1}{2}\right) - \frac{5\lambda^2 h^2}{48\pi^2 m\omega_0^4}\left(n^2 + n + \tfrac{11}{30}\right) . \tag{11.97}$$
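
Both perturbative energy formulae can be compared with a direct numerical diagonalisation, a modern technique which is swapped in here purely as a check and which was used neither by Heisenberg nor by Aitchison and his colleagues. The sketch assumes units with $h = 2\pi$ and $m = \omega_0 = 1$, takes the potentials to be $\tfrac{1}{2}x^2 + \tfrac{1}{4}\lambda x^4$ and $\tfrac{1}{2}x^2 + \tfrac{1}{3}\lambda x^3$ (the integrals of the force terms $\lambda x^3$ and $\lambda x^2$), and truncates the oscillator basis at an assumed size `N`; the value of `lam` is also an arbitrary choice.

```python
import numpy as np

N = 60        # size of the truncated oscillator basis (assumed)
lam = 0.01    # small nonlinearity parameter lambda (assumed value)

k = np.arange(1, N)
# position matrix in the harmonic-oscillator basis with h / 2 pi = m = omega0 = 1
x = np.diag(np.sqrt(k / 2.0), 1) + np.diag(np.sqrt(k / 2.0), -1)
h0 = np.diag(np.arange(N) + 0.5)          # unperturbed energies n + 1/2


def lowest_levels(potential, count=3):
    """Lowest eigenvalues of h0 + potential in the truncated basis."""
    return np.linalg.eigvalsh(h0 + potential)[:count]


n = np.arange(3)

# force term lam * x^3: formula (11.96) written with h = 2 pi
quartic_numeric = lowest_levels(lam / 4 * np.linalg.matrix_power(x, 4))
quartic_formula = (n + 0.5
                   + lam * 3 * (n ** 2 + n + 0.5) / 8
                   - lam ** 2 * (17 * n ** 3 + 25.5 * n ** 2 + 29.5 * n + 10.5) / 64)

# force term lam * x^2: formula (11.97) written with h = 2 pi
cubic_numeric = lowest_levels(lam / 3 * np.linalg.matrix_power(x, 3))
cubic_formula = n + 0.5 - 5 * lam ** 2 / 12 * (n ** 2 + n + 11 / 30)

print(np.max(np.abs(quartic_numeric - quartic_formula)))
print(np.max(np.abs(cubic_numeric - cubic_formula)))
```

The residual differences should be of the size of the neglected higher-order terms in $\lambda$, so they shrink rapidly as `lam` is reduced.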


11.6 The simple rotator


The final part of Sect. 3 of Heisenberg’s paper concerns the case of an electron in a circular 
orbit of radius a about the nucleus. This turns out to be a much simpler calculation. 
The classical quantum condition is $\oint p\,dq = nh$ and in this case $p = mva$, the angular
momentum, and $dq = d\theta$:

$$nh = \oint mva\,d\theta = \oint m\omega a^2\,d\theta . \tag{11.98}$$

Differentiating with respect to $n$,

$$h = \frac{d}{dn}(2\pi m a^2\omega) . \tag{11.99}$$

Using Kramers’ and Heisenberg’s version of Born’s correspondence rule, this expression
translates into

$$h = 2\pi m[a^2\omega(n+1, n) - a^2\omega(n, n-1)] . \tag{11.100}$$
Just as in the case of the nonlinear oscillator, the solution by inspection is

$$\omega(n, n-1) = \frac{h(n + \text{constant})}{2\pi m a^2} . \tag{11.101}$$

Again, since there should be no amplitude for transitions from the ground state to lower
states, the constant is zero and so

$$\omega(n, n-1) = \frac{hn}{2\pi m a^2} . \tag{11.102}$$
The energy of the electron is $W = \tfrac{1}{2}mv^2$ and so, using the same procedures as in Sect. 11.5,
the energy is

$$W = \frac{m}{2}\,a^2\,\frac{\omega^2(n, n-1) + \omega^2(n+1, n)}{2} = \frac{h^2}{8\pi^2 m a^2}\left(n^2 + n + \tfrac{1}{2}\right) . \tag{11.103}$$

As Heisenberg points out, this expression again satisfies the condition $\omega(n, n-1) =
(2\pi/h)[W(n) - W(n-1)]$. This result can also be written in the suggestive form

$$W = \frac{h^2}{8\pi^2 I}\left[n(n+1) + \tfrac{1}{2}\right] , \tag{11.104}$$

where $I = ma^2$ is the moment of inertia of the electron in its orbit about the nucleus.
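
The chain of results for the rotator is short enough to verify directly. The following sketch assumes units in which $h = m = a = 1$ (an arbitrary choice made only for the check) and uses illustrative function names.

```python
import math

h, m, a = 1.0, 1.0, 1.0   # assumed units, chosen only for this check


def omega(n):
    """Transition frequency omega(n, n-1) from (11.102)."""
    return h * n / (2 * math.pi * m * a ** 2)


def energy(n):
    """Energy W(n) from the right-hand side of (11.103)."""
    return h ** 2 / (8 * math.pi ** 2 * m * a ** 2) * (n ** 2 + n + 0.5)


for n in range(1, 6):
    # quantum condition (11.100): h = 2 pi m a^2 [omega(n+1, n) - omega(n, n-1)]
    assert math.isclose(2 * math.pi * m * a ** 2 * (omega(n + 1) - omega(n)), h)
    # energy as the mean over the two neighbouring transitions, left-hand form of (11.103)
    mean_form = 0.5 * m * a ** 2 * (omega(n) ** 2 + omega(n + 1) ** 2) / 2
    assert math.isclose(mean_form, energy(n))
    # Bohr frequency condition omega(n, n-1) = (2 pi / h)[W(n) - W(n-1)]
    assert math.isclose(omega(n), 2 * math.pi / h * (energy(n) - energy(n - 1)))
```

The additive term $\tfrac{1}{2}$ in the energy drops out of every frequency difference, which is why the frequency condition alone cannot fix it.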

The importance of this calculation for Heisenberg was that the expression (11.104) was 
in excellent agreement with the band spectra measured by Adolf Kratzer (1922). Kratzer 
had made a detailed analysis of the cyanide spectroscopic bands and found that he had to 
introduce half-integral quantum numbers to account for details of these rotational spectra. 
Heisenberg fully appreciated that his new quantum formalism automatically resulted in 
half-integral quantisation. 

The last paragraph of Heisenberg’s paper reads as follows: 


‘Whether a method to determine quantum-theoretical data using relations between observable
quantities, such as that proposed here, can be regarded as satisfactory in principle, or
whether this method after all represents far too rough an approach to the physical problem 
of constructing a theoretical quantum mechanics, an obviously very involved problem at 
the moment, can be decided only by a more intensive mathematical investigation of the 
method which has been very superficially employed here.’ 


11.7 Reflections 


Heisenberg’s caution in his concluding paragraph contrasts dramatically with its revolu- 
tionary content. As we will see, the theory was rapidly taken up by the leading theorists and 
converted into much more accessible form. There can be no doubt that Born fully appre- 
ciated the importance of what Heisenberg had achieved and made the correct decision to 
forward it to the Zeitschrift für Physik for immediate publication as soon as he had studied 
it in detail. 

Heisenberg was not happy with the paper, which contrasted with his previous papers 
in which well-posed problems were solved using established mathematical procedures. In 
his discussions with Pauli, he considered the ‘positive’ part of the paper to be ‘poor’. As 
remarked by Mehra and Rechenberg (1982b) 


‘Now, instead of a complete formulation of a consistent atomic theory, the future quantum 
mechanics, Heisenberg felt that he was able to present only a step in formulating its 
foundations, but by no means the final solution. ... He had just been able to write down 
a few formal equations, which were difficult to handle mathematically and even more 
difficult to interpret physically.’ 


What is staggering about the paper is how many ideas were to prove to be absolutely 
correct. Let us review these achievements.

1. By far the most significant and original idea was the concept that what had gone wrong 
with the old quantum theory was its kinematical content. 

2. This concept was coupled with the innovation that the quantum theory had to be written 
in terms of the properties of atomic systems which were measurable. 

3. This led to the concept, modelled on classical electromagnetism, that the spatial co- 
ordinates were to be replaced by ‘transition amplitudes’, the squares of which would 
determine the probability of such transitions taking place. 

4. Guided by the correspondence principle, the product law for the transition amplitudes 
was found to be non-commutative. 

5. In application to the specific problems which could be treated with his new formalism, 
the quantum expression for the energy levels of an oscillator and a simple rotator had a 
‘zero-point energy’ of $h\nu/2$.

6. In contrast to the kinematics, which was changed radically, the dynamics of particles 
and oscillators remained ‘Newtonian’. 

7. As pointed out by Aitchison and his colleagues (2004), many of the results derived by 
Heisenberg have their exact analogues in the completed version of quantum mechanics. 
For example, (11.54) illustrates a basic rule of quantum mechanics that a transition 
amplitude for a transition from one state to another is found by summing over all 
possible intermediate states. 


Despite its problems, Heisenberg’s paper is undoubtedly one of the great papers in 
theoretical physics. Almost immediately, it opened up routes to more powerful mathematical 
approaches. Heisenberg could only have achieved so much under the joint influences of the 
intuitive approach of Bohr and the more mathematical approaches of Born and Sommerfeld.


12.1 Born’s reaction



In his reminiscences, Born recounted his memories of these exciting days (Born, 1978): 


‘Meanwhile Heisenberg pursued some work of his own, keeping its idea and purpose 
somewhat dark and mysterious. Towards the end of the summer semester, in the first days 
of July 1925, he came to me with a manuscript and asked me to read it and decide whether 
it was worth publishing . . . He added that though he had tried hard, he could not make any 
progress beyond the simple considerations contained in his paper, and he asked me to try 
myself, which I promised... 

His most audacious step consists in the suggestion of introducing the transition amplitudes
of the coordinates q and momenta p in the formulae of mechanics...

I was most impressed by Heisenberg’s considerations, which were a great step forward 
in the programme which we had pursued... 

After having sent Heisenberg’s paper to Zeitschrift für Physik for publication, I began to
ponder about his symbolic multiplication, and was soon so involved in it that I thought the 
whole day and could hardly sleep at night. For there was something fundamental behind 
it... And one morning... I suddenly saw the light: Heisenberg’s symbolic multiplication 
was nothing but matrix calculus, well known to me since my student days from the lectures 
of Rosanes at Breslau.’ 


At that time, matrices and matrix algebra were regarded as the province of the math- 
ematicians and there were few examples of their application in physics. Heisenberg was 
certainly unaware of matrices, as he himself confessed. In fact, Born was one of the few 
physicists who had used matrices in his earlier researches with von Kármán on the theory of
crystal lattices (Born and von Kármán, 1912). He had therefore been familiar with matrix
algebra, with infinite matrices and the techniques of transforming infinite quadratic forms 
to principal axes. Born had moved away from these pursuits, but now he recalled his earlier 
activities. 

What Born recognised was that Heisenberg’s new multiplication rule (11.26) for transition
amplitudes corresponded to the multiplication rule for matrices, namely,

$$x^2(n, n-\tau) = \sum_{\tau'} x(n, n-\tau')\,x(n-\tau', n-\tau) . \tag{12.1}$$

In addition, it was well known that the multiplication rule for matrices is non-commutative, 
exactly the feature of his scheme which had caused Heisenberg so much concern. What 
Born did was to change slightly Heisenberg’s notation, by setting the transition amplitude
$q(n, n+\tau) = q(n, m)$ and then regarding $q(n, m)$ as the elements of a matrix $q$. Then,
when the matrix product $pq$ of two matrices $p$ and $q$ is taken, it is not necessarily equal to
$qp$.

The calculation which Born carried out can be appreciated from Heisenberg’s product
rule (11.28) for the product of two transition amplitudes $x(n, n-\tau)$ and $y(n, n-\tau)$:

$$x(n, n-\tau)\,y(n, n-\tau)\,e^{i\omega(n, n-\tau)t} = \sum_{\tau'} x(n, n-\tau')\,y(n-\tau', n-\tau)\,e^{i\omega(n, n-\tau)t} . \tag{12.2}$$


Born immediately made the significant step of defining a quantised ‘momentum amplitude’
through the definition

$$p = p(n, n-\tau) = m\dot{x}(n, n-\tau) , \tag{12.3}$$

by analogy with Heisenberg’s transition amplitudes. Thus, if $x = x(n, n-\tau) =
\sum_{\tau} x(n, n-\tau)\,e^{i\omega(n, n-\tau)t}$ then

$$p = m\dot{x}(n, n-\tau) = m\sum_{\tau} x(n, n-\tau)\,i\omega(n, n-\tau)\,e^{i\omega(n, n-\tau)t} . \tag{12.4}$$


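In so doing, Born had immediately introduced the pair of canonically conjugate space and
momentum variables in their quantum guise, $p$ and $q$. Therefore, using the rule (12.2), the
product $px$ is given by

$$px = m\sum_{\tau'} x(n, n-\tau')\,i\omega(n, n-\tau')\,e^{i\omega(n, n-\tau')t}\;x(n-\tau', n-\tau)\,e^{i\omega(n-\tau', n-\tau)t} \tag{12.5}$$

$$\phantom{px} = m\sum_{\tau'} x(n, n-\tau')\,i\omega(n, n-\tau')\,x(n-\tau', n-\tau)\,e^{i\omega(n, n-\tau)t} , \tag{12.6}$$

where $\sum_{\tau'}$ means ‘take the sum over all integral values of $\tau'$ from $-\infty$ to $+\infty$’. Now, let
us evaluate this sum for the case $\tau = 0$, corresponding to the time-invariant value of the
product of $p$ and $q$ and to the diagonal elements of the matrix product. Then,

$$px = m\sum_{\tau'} x(n, n-\tau')\,i\omega(n, n-\tau')\,x(n-\tau', n) , \tag{12.7}$$

and so,

$$\begin{aligned}
px - xp &= m\sum_{\tau'} x(n, n-\tau')\,i\omega(n, n-\tau')\,x(n-\tau', n)\\
&\quad - m\sum_{\tau'} x(n, n-\tau')\,i\omega(n-\tau', n)\,x(n-\tau', n) .
\end{aligned} \tag{12.8}$$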
We now use the rules $\omega(n, n-\tau') = -\omega(n-\tau', n)$ from Bohr’s frequency condition and
$x(n, n-\tau') = x^*(n-\tau', n)$ because the $x$ should be real. Let us first sum over all values
of $\tau' > 0$. Then, (12.8) becomes

$$px - xp = 2mi\sum_{\tau'>0} \omega(n, n-\tau')\,|x(n-\tau', n)|^2 . \tag{12.9}$$




We obtain a similar result if we sum over all values of $\tau' < 0$, but let us now write
$\tau' = -\tau'(+)$, where the $\tau'(+)$ are positive integers. Then,

$$\begin{aligned}
px - xp &= 2mi\sum_{\tau'<0} \omega(n, n-\tau')\,|x(n-\tau', n)|^2\\
&= 2mi\sum_{\tau'(+)>0} \omega(n, n+\tau'(+))\,|x(n+\tau'(+), n)|^2 .
\end{aligned} \tag{12.10}$$

Combining (12.9) and (12.10), we evidently obtain the result

$$px - xp = 2mi\sum_{\tau'>0}\left[\omega(n, n-\tau')\,|x(n-\tau', n)|^2 - \omega(n+\tau', n)\,|x(n+\tau', n)|^2\right] . \tag{12.11}$$

This expression can now be compared with Heisenberg’s quantum relation (11.40),

$$h = 4\pi m\sum_{\tau=0}^{+\infty}\left[|x(n, n+\tau)|^2\,\omega(n, n+\tau) - |x(n, n-\tau)|^2\,\omega(n, n-\tau)\right] . \tag{12.12}$$

It immediately follows that

$$px - xp = \frac{h}{2\pi i} . \tag{12.13}$$

Born appreciated the deep significance of this result: the non-commutativity of the product
$px - xp$ was directly related to quantum processes at the atomic level through the
appearance of Planck’s constant in (12.13).

He was well aware of the fact that the rule (12.13) only applied to the diagonal elements
of the matrix product, but he guessed that all the off-diagonal terms had to be zero. Then,
the non-commutativity rule could be written

$$px - xp = \frac{h}{2\pi i}\,I , \tag{12.14}$$

where $I$ is the unit matrix, but he was unable to prove this conjecture. He needed help.
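
Born's result can be illustrated with the harmonic-oscillator amplitudes of Chapter 11 as a test case. The sketch below is not Born's calculation but a numerical check under stated assumptions: units with $h = 2\pi$ and $m = \omega_0 = 1$, a finite array of `N` states (so the last diagonal entry is spoiled by the truncation), and illustrative variable names.

```python
import numpy as np

N = 8                                    # number of states kept (assumed truncation)
h, m, omega0 = 2 * np.pi, 1.0, 1.0       # assumed units
n = np.arange(N)
omega = omega0 * (n[:, None] - n[None, :])      # omega(n, k) = -omega(k, n)

# oscillator amplitudes x(n, n-1) = x(n-1, n) = sqrt(h n / 4 pi m omega0)
x = np.zeros((N, N))
for k in range(1, N):
    x[k, k - 1] = x[k - 1, k] = np.sqrt(h * k / (4 * np.pi * m * omega0))

# momentum amplitudes p(n, k) = m i omega(n, k) x(n, k), following (12.4)
p = m * 1j * omega * x

commutator = p @ x - x @ p
diagonal = np.diag(commutator)[: N - 1]          # drop the truncated last state
off_diagonal = commutator - np.diag(np.diag(commutator))

# the diagonal sum (12.11), evaluated term by term for the state n = 3
row = 3
total = 0.0
for tau in range(1, N):
    if row - tau >= 0:
        total += omega[row, row - tau] * abs(x[row - tau, row]) ** 2
    if row + tau < N:
        total -= omega[row + tau, row] * abs(x[row + tau, row]) ** 2
sum_rule = 2j * m * total

print(np.allclose(diagonal, h / (2j * np.pi)), np.isclose(sum_rule, h / (2j * np.pi)))
```

For this particular system the off-diagonal elements vanish as well, in line with Born's guess (12.14), although the sketch of course proves nothing for the general case.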


12.2 Born and Jordan’s matrix mechanics


Born’s first reaction was to ask Pauli to collaborate with him in developing the mathematical 
physics of matrix algebra as applied to quantum mechanics, but Pauli was dismissive of 
the suggestion — he did not believe in non-commutativity and preferred to let Heisenberg 
continue the line of research he had pioneered. He was also opposed to Born’s formal 
mathematical approach to quantum physics. He told Born 


‘Yes, I know, you are fond of tedious and complicated formalisms. You will spoil
Heisenberg’s physical ideas by your futile mathematics.’


Instead, Born turned to Pascual Jordan, another remarkably talented product of the 
Göttingen school of mathematical physics. Jordan had attended Courant’s courses on
mathematics and had assisted him in the preparation of his classic text with Hilbert The
Methods of Mathematical Physics (Courant and Hilbert, 1924). As a result of this work, he 
had some familiarity with the theory of matrices, although he did not claim any particular 
speciality in the field. He had already helped Born in the preparation of his review of 
the dynamics of crystal lattices (Born, 1923) and written a paper with Born in which 
van Vleck’s theory of dispersion was extended to aperiodic motions (Born and Jordan, 
1925b). He immediately took up Born’s challenge and within a few days proved Born’s 
conjecture concerning the non-commutativity of (12.14) using matrix methods. Born and 
Jordan now took up the challenge of attempting to work out a fully self-consistent system 
of quantum mechanics using matrix methods. Born was exhausted and suffered a minor 
breakdown during the succeeding month and so much of the analysis was carried out by 
Jordan alone. In a remarkably short space of time, they made huge progress towards the first 
complete exposition of quantum mechanics using matrix methods. Their important paper 
was submitted only 60 days after Heisenberg’s (Born and Jordan, 1925b). 
Their agenda is clearly set out in the introduction to their paper 


‘... It is in fact possible, starting with the basic premises given by Heisenberg, to build
up a closed theory of quantum mechanics which displays strikingly close analogies with 
classical mechanics, but at the same time preserves characteristic features of quantum 
phenomena.’ 


Born’s objective was to rewrite all the equations of classical physics in matrix notation 
so that the key concept of non-commutativity would automatically be incorporated in the 
new quantum mechanics. Let us review the remarkable contents of Born and Jordan’s
paper.


12.2.1 Matrix algebra


Matrices and matrix algebra had been highly developed by the mathematicians but had, as 
yet, found few applications in mathematical physics. They appear in Chapter 1 of Courant 
and Hilbert’s The Methods of Mathematical Physics and in Böcher’s Algebra (Böcher,
1911), which were consulted by Jordan. In Chap. 1 of Born and Jordan’s paper, the
elementary operations of matrix calculations were reviewed. We recall only the essential 
features of these calculations, emphasising how Born and Jordan had to develop the standard 
procedures. 

First, they used a slightly different notation from the usual use of suffices to identify the
matrix elements, the objective being to make the notation similar to that used by Heisenberg,
so that, as noted above, $x(n, n-\tau) = x(n, m)$. We will adopt the following notation:

$$a = [a(nm)] = \begin{pmatrix} a(00) & a(01) & a(02) & \cdots\\ a(10) & a(11) & a(12) & \cdots\\ a(20) & a(21) & a(22) & \cdots\\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} . \tag{12.15}$$


Then the standard texts show how many of the operations are similar to those of ordinary
algebra, an important exception being the rule for matrix multiplication. Thus,

The equality of two matrices:

$$a = b \quad\text{means}\quad a(nm) = b(nm) . \tag{12.16}$$

Matrix addition:

$$a = b + c \quad\text{means}\quad a(nm) = b(nm) + c(nm) . \tag{12.17}$$

Matrix multiplication:

$$a = bc \quad\text{means}\quad a(nm) = \sum_{k=0}^{\infty} b(nk)\,c(km) . \tag{12.18}$$


This is the rule Born recognised in Heisenberg’s new multiplication rule for transition 
amplitudes. Powers are defined by repeated matrix multiplication. 

The associative rule for matrix multiplication and the distributive rule for combined
matrix multiplication and addition are

$$(ab)c = a(bc) ; \tag{12.19}$$

$$a(b + c) = ab + ac . \tag{12.20}$$

Non-commutativity of matrix multiplication means that, in general,

$$ab \neq ba . \tag{12.21}$$


If it turns out that $ab = ba$, the matrices $a$ and $b$ are said to commute.
The unit matrix $I$ is defined by

$$I = [\delta(n, m)] , \qquad \delta(n, m) = \begin{cases} 1 & \text{if } n = m ,\\ 0 & \text{if } n \neq m . \end{cases} \tag{12.22}$$

It follows that

$$aI = Ia = a . \tag{12.23}$$

The reciprocal matrix is defined by

$$a^{-1}a = aa^{-1} = I , \tag{12.24}$$

provided the determinant of $a$ is not zero. As shown in the standard textbooks, the
elements of $a^{-1}$ are found by forming the adjoint matrix (adj $a$), the elements of which
are the transposed cofactors of $a(nm)$. Then, $a(\text{adj}\,a) = (\text{adj}\,a)a = |a|I$.

Differentiation with respect to a parameter $t$. If the elements of the matrices $a$ and $b$ are
functions of a parameter $t$, then differentiation of an element of the product $ab$ is given
by

$$\frac{d}{dt}\sum_k a(nk)\,b(km) = \sum_k \{\dot{a}(nk)\,b(km) + a(nk)\,\dot{b}(km)\} . \tag{12.25}$$

This can be rewritten

$$\frac{d}{dt}(ab) = \dot{a}b + a\dot{b} . \tag{12.26}$$

Note the importance of the order of the differentiations because of the non-commutativity
of the matrix product.

Repeated application of (12.26) results in the following rule

$$\frac{d}{dt}(x_1 x_2 \ldots x_n) = \dot{x}_1 x_2 \ldots x_n + x_1\dot{x}_2 \ldots x_n + \cdots + x_1 x_2 \ldots \dot{x}_n . \tag{12.27}$$
Finally, we can define functions of matrices, for example,

$$\begin{aligned} f_1(y_1, \ldots, y_m; x_1, \ldots, x_n) &= 0 ,\\ &\;\;\vdots\\ f_m(y_1, \ldots, y_m; x_1, \ldots, x_n) &= 0 . \end{aligned} \tag{12.28}$$

These results were all in the literature and they were quickly used by Jordan to demonstrate 
the correctness of Born’s conjecture (12.14). For the case of the one-dimensional nonlinear 
harmonic oscillator, Jordan rewrote the equation of motion as follows:

$$\ddot{q} + \omega_0^2 q + q^n = 0 , \qquad n = 2, 3, \ldots \tag{12.29}$$


Multiplying (12.29) by $mq$, first from the right, then from the left and subtracting these
matrix relations, Jordan found the result

$$m(\ddot{q}q - q\ddot{q}) = 0 , \tag{12.30}$$

where $0$ is the infinite zero matrix. We now use (12.26) to write

$$m\frac{d(\dot{q}q)}{dt} = m\frac{d\dot{q}}{dt}\,q + m\dot{q}\,\frac{dq}{dt} = m\ddot{q}q + m\dot{q}\dot{q} . \tag{12.31}$$

Similarly,

$$m\frac{d(q\dot{q})}{dt} = m\dot{q}\dot{q} + mq\ddot{q} . \tag{12.32}$$

Substituting (12.31) and (12.32) into (12.30), we find

$$m\frac{d}{dt}(\dot{q}q - q\dot{q}) = 0 . \tag{12.33}$$

Finally, in the spirit of Born’s extension of the concept of transition amplitudes to momenta,
we write $p = m\dot{q}$, and so

$$\frac{d}{dt}(pq - qp) = 0 . \tag{12.34}$$


Now, the off-diagonal terms of the product matrices $pq$ and $qp$ all contain exponential
time-varying terms of the form $\exp[(2\pi i/h)(E_n - E_m)t]$ where $E_n$ and $E_m$ are the
energies of the stationary states $n$ and $m$ and so the only way in which (12.34) can be
satisfied is if the off-diagonal terms are all zero. This result, and the speed with which 
Jordan obtained it, greatly pleased Born. They agreed to develop the full theory of matrix 
mechanics collaboratively. 
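
The time-independence expressed by (12.34) can be seen explicitly in a simple test case. The sketch below assumes the harmonic-oscillator levels and amplitudes (not the nonlinear oscillator treated by Jordan), units with $h = 2\pi$ and $m = \omega_0 = 1$, and a truncation at `N` states; all names are illustrative.

```python
import numpy as np

N = 6                                           # number of states kept (assumed)
h, m, omega0 = 2 * np.pi, 1.0, 1.0              # assumed units
n = np.arange(N)
energy = h * omega0 / (2 * np.pi) * (n + 0.5)   # oscillator levels used as a test case

amplitude = np.zeros((N, N))
for k in range(1, N):
    amplitude[k, k - 1] = amplitude[k - 1, k] = np.sqrt(h * k / (4 * np.pi * m * omega0))


def q_and_p(t):
    """Matrices q(t) and p(t) = m dq/dt, whose elements vary as exp[(2 pi i / h)(E_n - E_m) t]."""
    rate = 2j * np.pi / h * (energy[:, None] - energy[None, :])
    q = amplitude * np.exp(rate * t)
    return q, m * rate * q


def commutator(t):
    """The matrix pq - qp at time t."""
    q, p = q_and_p(t)
    return p @ q - q @ p


c0, c1 = commutator(0.0), commutator(1.234)
print(np.allclose(c0, c1), np.allclose(c0 - np.diag(np.diag(c0)), 0.0))
```

Each off-diagonal element would carry an oscillating factor if its amplitude were non-zero, so a time-independent commutator has to be diagonal, which is the step in Jordan's argument.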

As remarked by Jammer (1989), 


‘The equations of motion have shown that $(pq - qp)$ is a diagonal matrix. That all
diagonal elements are equal to $h/2\pi i$ has been a consequence of the correspondence
principle. Moreover, since [(12.14)], as Born soon realised, is the only fundamental
equation in which $h$ appears, the introduction of Planck’s constant into quantum mechanics
has likewise been a consequence of the correspondence principle.’


Reflecting on these calculations, Born (1978) later remarked, 


‘I shall never forget the thrill I experienced when I succeeded in condensing Heisenberg’s
ideas on quantum conditions into the mysterious equation $(pq - qp) = (h/2\pi i)I$.’


12.2.2 Matrix dynamics 


With these new insights, Born and Jordan set about rewriting the laws of dynamics in matrix 
form. While the results obtained above could be obtained from the standard textbooks, the 
next steps involved the development of a formalism corresponding to the differentiation 
of one matrix with respect to another. Again, care has to be taken since the variables do 
not commute under matrix multiplication. Therefore, each matrix product has to be written 
out and the correct order of differentiation preserved. For example, if $y$, $x_1$, $x_2$ and $x_3$ are
matrices and if, say,

$$y = x_1^2 x_2 x_1 x_3 , \quad\text{then}\quad \frac{\partial y}{\partial x_1} = x_1 x_2 x_1 x_3 + x_2 x_1 x_3 x_1 + x_3 x_1^2 x_2 , \tag{12.35}$$

where we have used the permutation rule that $x_1 x_1 x_2 x_1 x_3 = x_1 x_2 x_1 x_3 x_1 = x_1 x_3 x_1^2 x_2$ to
bring each $x_1$ to the front of the queue for differentiation. These and further rules for matrix
manipulation completed Sect. 2 of Chap. 1 of Born and Jordan’s paper.
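
The rule just described, in which each occurrence of the variable is brought to the front by cyclic permutation and then removed, is mechanical enough to be written as a few lines of code. The following is a minimal sketch of that rule for a single ordered product; the function name and the representation of a product as a list of symbol names are assumptions made for illustration.

```python
def matrix_derivative(factors, variable):
    """Derivative of an ordered product of non-commuting symbols with respect to `variable`.

    `factors` is a list such as ["x1", "x1", "x2", "x1", "x3"].  For every
    occurrence of `variable`, the product is cyclically rearranged so that this
    occurrence stands first, and that leading factor is then removed.
    """
    terms = []
    for position, symbol in enumerate(factors):
        if symbol == variable:
            rotated = factors[position:] + factors[:position]
            terms.append(rotated[1:])
    return terms


y = ["x1", "x1", "x2", "x1", "x3"]          # y = x1^2 x2 x1 x3, as in (12.35)
dy_dx1 = matrix_derivative(y, "x1")
print([" ".join(term) for term in dy_dx1])
# ['x1 x2 x1 x3', 'x2 x1 x3 x1', 'x3 x1 x1 x2']
```

The three terms are those on the right-hand side of (12.35), one for each of the three places at which $x_1$ appears in $y$.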

Chapter 2 introduces the laws of dynamics in matrix form. Immediately, they state that 
the dynamical system is to be defined by a spatial coordinate q and momentum p according 
to the definitions

$$q = [q(nm)\,e^{2\pi i\nu(nm)t}] , \qquad p = [p(nm)\,e^{2\pi i\nu(nm)t}] . \tag{12.36}$$


Whereas Heisenberg had developed his arguments starting from Newton’s laws of motion, 
Born and Jordan replaced these by the more powerful methods of Hamiltonian dynamics 
which had considerable success in the old quantum theory. Like Heisenberg, they only 
considered systems of one degree of freedom, one of their intentions being to demonstrate 
that this more rigorous formal approach could reproduce Heisenberg’s results and so create 
a formally self-consistent theory of quantum mechanics in which the dynamical entities 
were matrices. 

Immediately, they noted that, in general, the $q(nm)$ and $p(nm)$ are complex numbers.
As a result, these matrices must be Hermitian matrices in which, on transposition, each
element goes over into its complex conjugate. As a result, it follows that

$$q(nm)\,q(mn) = |q(nm)|^2 \quad\text{and}\quad \nu(nm) = -\nu(mn) . \tag{12.37}$$

Then, in the case of Cartesian coordinates, they identified $|q(nm)|^2$ as a measure of the
probability of the transitions $n \to m$.

Next, they translated the Hamiltonian equations of classical mechanics, which we discussed
in Sect. 5.4.3, into matrix form. The Hamiltonian (5.64) can be written in matrix
form as

$$H = \frac{1}{2m}p^2 + U(q) . \tag{12.38}$$

Then, the equations of motion can be written in canonical form

$$\dot{q} = \frac{\partial H}{\partial p} , \qquad \dot{p} = -\frac{\partial H}{\partial q} . \tag{12.39}$$

They then showed that this formulation leads back to Heisenberg’s quantisation rule (11.40)
and to the Thomas–Kuhn expression (10.34).
Having set out the basic elements of matrix mechanics, Born and Jordan next state, 


‘The content of the previous paragraphs furnishes the basic rules of the new quantum
mechanics in their entirety. All other laws of quantum mechanics, whose general validity
is to be verified, must be derivable from these basic tenets. As instances of such laws to 
be proved, the law of conservation of energy and the Bohr frequency condition primarily 
enter into consideration.’ 


The law of conservation of energy requires that $\dot{H} = 0$ and consequently, as we have shown,
the Hamiltonian matrix $H$ must be diagonal. Then, Heisenberg identified the diagonal
elements $H(nn)$ as the energies of the stationary states with quantum number $n$ and so
Bohr’s frequency condition followed immediately:

$$h\nu(nm) = H(nn) - H(mm) , \tag{12.40}$$

and the energy of the $n$th state is

$$W_n = H(nn) + \text{constant} . \tag{12.41}$$

Born and Jordan show formally how this comes about.
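
The canonical equations, the diagonal Hamiltonian and the frequency condition can all be checked together for one simple system. The sketch below assumes the harmonic oscillator, $U(q) = \tfrac{1}{2}m\omega_0^2 q^2$, as the test case, units with $h = 2\pi$ and $m = \omega_0 = 1$, and a truncation at `N` states (the last state is discarded because the truncation spoils it); the helper `time_derivative` is a hypothetical name.

```python
import numpy as np

N = 8                                           # number of states kept (assumed)
h, m, omega0 = 2 * np.pi, 1.0, 1.0              # assumed units
n = np.arange(N)
nu = omega0 / (2 * np.pi) * (n[:, None] - n[None, :])     # nu(nm) = -nu(mn)

q = np.zeros((N, N), dtype=complex)
for k in range(1, N):
    q[k, k - 1] = q[k - 1, k] = np.sqrt(h * k / (4 * np.pi * m * omega0))


def time_derivative(matrix):
    """Derivative at t = 0 of a matrix whose elements vary as exp[2 pi i nu(nm) t]."""
    return 2j * np.pi * nu * matrix


p = m * time_derivative(q)                      # first canonical equation: dq/dt = p / m
H = p @ p / (2 * m) + 0.5 * m * omega0 ** 2 * (q @ q)

keep = N - 1                                    # discard the state spoiled by truncation
# second canonical equation: dp/dt = -dU/dq = -m omega0^2 q
second_equation = np.allclose(time_derivative(p), -m * omega0 ** 2 * q)
diagonal = np.allclose(H - np.diag(np.diag(H)), 0.0)
levels = np.real(np.diag(H))
# frequency condition (12.40): h nu(nm) = H(nn) - H(mm)
bohr = np.allclose(h * nu[:keep, :keep], (levels[:, None] - levels[None, :])[:keep, :keep])

print(second_equation, diagonal, bohr)
```

In this example the diagonal elements come out as $(h\omega_0/2\pi)(n + \tfrac{1}{2})$, so the energies of Chapter 11 are recovered from the matrix form of the Hamiltonian.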

Chapter 3 of their paper concerns the application of their new approach to harmonic and 
anharmonic oscillators, the intention being to derive Heisenberg’s result with the new matrix 
formalism. Suffice to say that they succeeded, but without the need to introduce some of the 
assumptions made by Heisenberg. For example, Heisenberg had implicitly assumed that
transitions only occurred between neighbouring states, whereas the new formalism required
the quantum transitions to correspond to changes of quantum number $\Delta n = \pm 1$. The final
Chap. 4 of their paper concerned an early application of these concepts to electromagnetic 
radiation. A significant feature of this last section was the demonstration that the squares 
of the amplitudes of the matrix elements representing the dipole moment of the transition 
determine the transition probabilities. 

As already noted, Born was recovering from exhaustion during the elaboration of these 
concepts and, whilst he was the instigator of the programme and introduced the concepts 
of the matrix multiplication rule and the basic quantum relation (12.14), the detailed 
elaboration of the rules of matrix mechanics and dynamics was largely Jordan’s work. Once 
the paper was submitted, the collaboration with Heisenberg resumed and resulted in the 
definitive paper describing the full theory of matrix mechanics by Born, Heisenberg and 
Jordan (1926).


12.3 Born, Heisenberg and Jordan (1926) - the Three-Man Paper


It is remarkable that the ‘Three-Man Paper’ ended up being as coherent as it did since 
there was actually remarkably little face-to-face contact between the three authors. After 
the completion of his revolutionary paper of 1925, Heisenberg spent nearly a month and a 
half in Munich and a walking tour in the Alps. Then, rather than return to Göttingen, he 
went back to Copenhagen and only returned to Göttingen in late October. Born returned to 
Göttingen following his period of recuperation and then departed for his trip as a ‘foreign 
lecturer’ to the Massachusetts Institute of Technology on 31 August 1926. Thus, much of 
the work on the paper, which was received for publication by the Zeitschrift für Physik
on 15 November 1926, was carried out by correspondence. Equally remarkable is the fact 
that this long paper On quantum mechanics II was a remarkably complete exposition of 
the theory of matrix mechanics as applied to quantum physics and dealt with many of the 
incompletenesses of Heisenberg’s and Born and Jordan’s earlier papers (Born et al., 1926). 

The fact that the theory was developed so rapidly can be attributed to a number of factors. 
Although Heisenberg confessed that he had not appreciated that non-commutativity is a 
key feature of the mathematics of matrices, as soon as he heard of Born and Jordan’s 
successes, he was an instant convert and quickly learned and applied these tools to the 
various problems he had worked on. Born and Jordan were delighted that he joined their 
collaboration since he brought to the research quite different skills and approaches to those 
of Born and Jordan. The second feature of the collaboration was that all three were experts 
in the classical theory of multiply periodic systems, the perturbation schemes used and the 
theory of canonical transformations. These classical techniques now had to be translated 
into the language of matrix mechanics. The third piece of good fortune was that the 
theory of infinite quadratic forms had been developed by Hilbert, Born’s teacher, 20 years 
earlier. Born not only had a deep understanding of these methods, but also realised that 
the matrix theory of canonical transformations was identical with the theory of principal- 
axes transformations which would turn out to be the route to the calculation of the energy 
levels in atomic systems. A further bonus was that Hilbert’s formalism, as extended by 
Ernst Hellinger, enabled discrete and continuous states of atomic systems to be treated in 
a unified manner. 

Mehra and Rechenberg provide a succinct summary of the achievements of the Three- 
Man Paper: 


‘The endeavours of Born, Heisenberg and Jordan led to the development of the theory 
of matrix mechanics, which was applicable to all types of multiply periodic systems, 
to non-degenerate and degenerate ones, and in principle even to aperiodic systems. In 
addition, the authors realised that the matrix equations had a simpler structure than 
the corresponding classical equations. Thus, for example, the set of Hamilton—Jacobi 
partial differential equations of classical mechanics was replaced by a set of algebraic 
equations; their solution established a unitary transformation of the quantum-mechanical 
Hamiltonian matrix to a diagonal matrix. Now the discussion of conservation laws also 
appeared to be considerably more elementary.’ (Mehra and Rechenberg, 1982c) 

As discussed by van der Waerden (1967) and Mehra and Rechenberg (1982c), the ideas 
were developed in a flurry of correspondence between the three authors, as well as with 
Pauli and Bohr. Let us give some impression of the contents of the ‘final’ version of matrix 
mechanics. 


12.3.1 Basic theory — systems with one degree of freedom 


The introduction was written by Heisenberg and he immediately summarised the advances 
since his own paper (Heisenberg, 1925) and Born and Jordan’s Paper I (Born and Jordan, 
1925b): 


‘The present paper sets out to develop further a general quantum-theoretical mechanics 
whose physical and mathematical basis has been treated in two previous papers by the 
present authors. It was found possible to extend the above theory to systems having several 
degrees of freedom (Chapter 2), and by the introduction of “canonical transformations” 
to reduce the problem of integrating the equations of motion to a known mathematical 
formulation. From this theory of canonical transformations we are able to derive a per- 
turbation theory (Chapter 1, §4) which displays close similarity to classical perturbation 
theory. On the other hand we were able to trace a connection between quantum mechan- 
ics and the highly-developed mathematical theory of quadratic forms of infinitely many 
variables (Chapter 3).’ 


But, they recognised that the new quantum mechanics is not visualisable as was the old 
quantum theory: 


‘Admittedly, such a system of quantum-theoretical relations between observable quanti- 
ties, when compared with the quantum theory employed hitherto, would labour under the 
disadvantage of not being directly amenable to a geometrically visualizable interpretation, 
since the motions of the electrons cannot be described in terms of the familiar concepts 
of space and time.’ 


Despite this, 


‘If one reviews the fundamental differences between classical and quantum theory, dif- 
ferences which stem from the basic quantum theoretical postulates, then the formalism 
proposed..., if proved to be correct, would appear to represent a system of quantum 
mechanics as close to that of classical theory as could reasonably be hoped.’ 


Chapter 1 begins by setting out the basic formalism of matrix calculus for systems having 
one degree of freedom and re-introducing Born’s equation (12.14), 


$$pq - qp = \frac{h}{2\pi i}\,I. \qquad (12.42)$$

They remark on the fact that this 


‘... is the only one of the basic formulae in the quantum mechanics proposed here which 
contains Planck’s constant h. It is satisfying that the constant already enters into the 
basic tenets of the theory at this stage in so simple a form.’ 

This basic relation leads to a more general expression. They show that, if f(p, q) is any 
function of p and q, then 


$$fq - qf = \frac{h}{2\pi i}\,\frac{\partial f}{\partial p}, \qquad (12.43)$$

$$pf - fp = \frac{h}{2\pi i}\,\frac{\partial f}{\partial q}. \qquad (12.44)$$


They demonstrate this by showing that if the formulae (12.43) and (12.44) are valid for 
each of the functions φ and ψ, then they must also be valid for the combinations φ + ψ 
and φ · ψ. The first of these is trivial, while the second follows from a simple calculation. 
Setting f = φ · ψ, the left-hand side of (12.43) becomes 

$$\varphi\psi q - q\varphi\psi = \varphi(\psi q - q\psi) + (\varphi q - q\varphi)\psi = \left(\varphi\,\frac{\partial\psi}{\partial p} + \frac{\partial\varphi}{\partial p}\,\psi\right)\frac{h}{2\pi i} = \frac{\partial(\varphi\psi)}{\partial p}\,\frac{h}{2\pi i}. \qquad (12.45)$$

A similar treatment follows for p(φ · ψ) − (φ · ψ)p in (12.44). Therefore, since (12.43) 
and (12.44) hold for p and q they must also be valid for every function f which can be 
expressed as a power series in p and q. 
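
The relations (12.42) to (12.44) can be checked numerically on finite matrices. The following is only a sketch under stated assumptions: it uses harmonic-oscillator matrices for q and p in units with h/2π = 1, truncated to N rows and columns (the truncation size `N` and the helper `comm` are illustrative choices, not part of the Three-Man Paper). Because the true matrices are infinite, the last row and column of the truncated commutator are spoiled, so the comparison is restricted to the leading block.

```python
import numpy as np

N = 12          # truncation size (assumption; the true matrices are infinite)
hbar = 1.0      # units with h/(2*pi) = 1

# Truncated lowering operator in the harmonic-oscillator basis.
a = np.diag(np.sqrt(np.arange(1, N)), k=1)
q = (a + a.conj().T) / np.sqrt(2)
p = (a - a.conj().T) / (1j * np.sqrt(2))


def comm(x, y):
    """Return xy - yx."""
    return x @ y - y @ x


c = hbar / 1j                # the factor h/(2*pi*i)
core = slice(0, N - 1)       # block not affected by the truncation

# (12.42): pq - qp = (h/2 pi i) I
ok_42 = np.allclose(comm(p, q)[core, core], c * np.eye(N)[core, core])

# (12.43) with f = p^2: fq - qf = (h/2 pi i) * df/dp = (h/2 pi i) * 2p
f = p @ p
ok_43 = np.allclose(comm(f, q)[core, core], (c * 2 * p)[core, core])

# (12.44) with g = q^2: pg - gp = (h/2 pi i) * dg/dq = (h/2 pi i) * 2q
g = q @ q
ok_44 = np.allclose(comm(p, g)[core, core], (c * 2 * q)[core, core])

print(ok_42, ok_43, ok_44)
```

The two quadratic test functions are the simplest non-trivial cases of the product rule used in the proof; higher powers would need a larger margin between the block that is compared and the truncation edge.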
Next, the canonical equations (12.39) are introduced and energy conservation and Bohr’s 
frequency condition derived from the basic postulates. Central to the development is the 
frequency combination principle according to which 


$$\nu(nm) + \nu(mk) = \nu(nk), \qquad (12.46)$$

which leads to the expression 

$$\nu(nm) = \frac{W_n - W_m}{h}, \qquad (12.47)$$

where the $W_n$ are the energy levels or ‘energy terms’. These can be converted into a 
diagonal matrix W according to the definition 

$$W(nm) = \begin{cases} W_n & \text{for } n = m, \\ 0 & \text{for } n \neq m. \end{cases} \qquad (12.48)$$

Now, for any quantum theoretical matrix a, the elements of the matrix take the form 

$$a = \left[a(nm)\,e^{2\pi i\nu(nm)t}\right], \qquad (12.49)$$

and so the time derivative of a is 

$$\dot{a} = \left[2\pi i\,\nu(nm)\,a(nm)\right], \qquad (12.50)$$

where the exponential time dependence has been dropped since it cancels through all the 
formulae. We can now use (12.48) to write the energy terms as follows: 

$$Wa = \left[\sum_k W(nk)\,a(km)\right] = \left[\sum_k \delta_{nk} W_n\,a(km)\right] = \left[W_n\,a(nm)\right], \qquad (12.51)$$

$$aW = \left[\sum_k a(nk)\,W(km)\right] = \left[\sum_k a(nk)\,\delta_{km} W_m\right] = \left[W_m\,a(nm)\right]. \qquad (12.52)$$

Inserting these relations into (12.47) and (12.50), we find the general result for any quantum 
theoretical quantity a, 

$$\dot{a} = \frac{2\pi i}{h}\,(Wa - aW). \qquad (12.53)$$
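
As a quick illustration of (12.53), the time derivative of a matrix whose elements oscillate at the frequencies ν(nm) = (W_n − W_m)/h can be compared with the commutator expression. This is a minimal sketch: the energy terms in `W_levels` and the random amplitudes `amp` are hypothetical values chosen only for the check, and the derivative is approximated by a central finite difference.

```python
import numpy as np

h = 1.0                                        # Planck's constant, arbitrary units
W_levels = np.array([0.0, 1.0, 2.5, 4.5])      # hypothetical energy terms W_n
W = np.diag(W_levels)

# Frequency condition (12.47): nu(nm) = (W_n - W_m) / h
nu = (W_levels[:, None] - W_levels[None, :]) / h

rng = np.random.default_rng(0)
amp = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))   # amplitudes a(nm)


def a_of_t(t):
    """Matrix (12.49) with elements a(nm) exp(2 pi i nu(nm) t)."""
    return amp * np.exp(2j * np.pi * nu * t)


t, dt = 0.3, 1e-6
a_dot_numeric = (a_of_t(t + dt) - a_of_t(t - dt)) / (2 * dt)

a_t = a_of_t(t)
a_dot_matrix = (2j * np.pi / h) * (W @ a_t - a_t @ W)           # right side of (12.53)

max_err = np.max(np.abs(a_dot_numeric - a_dot_matrix))
print(max_err)
```

The agreement does not depend on the particular amplitudes, only on W being diagonal and on the frequencies obeying the combination principle (12.46).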
Next, we need to show that this formalism is consistent with Hamilton’s equations in 
matrix form. In (12.53), we set successively a = q and a = p and substitute for $\dot{q}$ and $\dot{p}$ 
in (12.39). Then, we use (12.43) and (12.44) with f = H to determine $\partial H/\partial q$ and $\partial H/\partial p$. 
The result is the pair of equations 

$$Wq - qW = Hq - qH\,; \quad Wp - pW = Hp - pH, \qquad (12.54)$$

which can be rewritten 

$$(W - H)q - q(W - H) = 0\,; \quad (W - H)p - p(W - H) = 0. \qquad (12.55)$$


Thus, (W — H) commutes with p and q and hence with every other function of (p, q). In 
particular, it commutes with the energy function, or Hamiltonian, H, 


$$(W - H)H - H(W - H) = 0. \qquad (12.56)$$

Hence, from (12.53), we find 

$$\dot{H} = 0. \qquad (12.57)$$

This result proves that energy is conserved and that H is a diagonal matrix, $H(nm) = \delta_{nm} H_n$. 
Consequently, from the first expression in (12.54), 

$$\nu(nm) = \frac{H_n - H_m}{h}. \qquad (12.58)$$

Now, Born and his colleagues invert the argument with important consequences for 
the further development of the theory. If now they assume energy conservation and the 
frequency condition (12.58) and that the energy function H is an analytic function of 
variables P and Q, then provided 

$$PQ - QP = \frac{h}{2\pi i}\,I, \qquad (12.59)$$

the canonical equations 

$$\dot{Q} = \frac{\partial H}{\partial P}, \quad \dot{P} = -\frac{\partial H}{\partial Q} \qquad (12.60)$$

always apply. This leads naturally to one of the most important features of their scheme 
of quantum mechanics, the concept of canonical transformations. The above arguments 
suggest that a canonical transformation from variables (P, Q) to (p, q) can be written 

$$pq - qp = PQ - QP = \frac{h}{2\pi i}\,I. \qquad (12.61)$$


Heisenberg and Born both worked on the nature of the transformations from variables 
(P, Q) to (p, q). Born’s knowledge of matrix algebra enabled him to come up with the most 
general form of transformation which was of the form 

$$P = SpS^{-1}, \quad Q = SqS^{-1}. \qquad (12.62)$$

By the same reasoning as above, these transformations must also hold good for sums and 
products of p and q and so more generally, 

$$f(P, Q) = Sf(p, q)S^{-1}. \qquad (12.63)$$

The importance of this result is most simply explained in the authors’ own words: 


‘The importance of the canonical transformation is due to the following theorem: If any 
pair of values p₀, q₀ be given which satisfy [(12.60)], then the problem of integrating the 
canonical equations for the energy function H(p, q) can be reduced to the following: A 
function S is to be determined, such that when 

$$p = Sp_0S^{-1}, \quad q = Sq_0S^{-1}, \qquad (12.64)$$

the function 

$$H(p, q) = SH(p_0, q_0)S^{-1} = W \qquad (12.65)$$

becomes a diagonal matrix. Equation [(12.65)] is the analogue to the Hamilton partial 
differential equation, and in a sense stands for the action function.’ 


This is the key result — the diagonal elements of the matrix W are the stationary energy terms 
of the system. The problem has been reduced to the determination of the transformation of 
H(p, q) to diagonal form and the procedures for doing this were already in the mathematical 
literature. The remainder of Chap. 1 of their paper was devoted to showing how the new 
formalism could deal with perturbation theory and time-variability of the energy function 
H. 
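
The reduction of the dynamical problem to finding a transformation S that makes the energy function diagonal can be illustrated with finite matrices. The sketch below is not the authors' procedure: it assumes truncated oscillator matrices for p₀ and q₀ (units with h/2π = 1), a hypothetical anharmonic energy function with coupling `lam`, and it obtains S from NumPy's Hermitian eigensolver rather than from perturbation theory.

```python
import numpy as np

N = 20                                   # truncation size (assumption)
a = np.diag(np.sqrt(np.arange(1, N)), k=1)
q0 = (a + a.conj().T) / np.sqrt(2)
p0 = (a - a.conj().T) / (1j * np.sqrt(2))

lam = 0.1                                # hypothetical anharmonic coupling


def energy_function(p, q):
    """H(p, q) = p^2/2 + q^2/2 + lam * q^4, an illustrative choice."""
    return 0.5 * (p @ p + q @ q) + lam * np.linalg.matrix_power(q, 4)


H0 = energy_function(p0, q0)

# eigh gives H0 = V diag(energies) V^dagger, so S = V^dagger diagonalises H0.
energies, V = np.linalg.eigh(H0)
S, S_inv = V.conj().T, V

# Canonical transformation in the form of (12.64).
p = S @ p0 @ S_inv
q = S @ q0 @ S_inv

# (12.65): S H(p0, q0) S^-1 = W should be diagonal.
W = S @ H0 @ S_inv
is_diagonal = np.allclose(W - np.diag(np.diag(W)), 0, atol=1e-8)

# (12.63): the same S transforms any function of p and q.
same_function = np.allclose(energy_function(p, q), W, atol=1e-8)

print(is_diagonal, same_function, energies[:3].real)
```

The diagonal entries of `W` are the energy terms of the truncated problem; how closely they approximate those of the infinite matrices depends on the truncation size, which is why no specific numerical values are quoted here.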


12.3.2 Systems with an arbitrary number of degrees of freedom and Hermitian forms 


The second chapter deals with the extension of the formalism to several degrees of freedom, 
f > 1, and involves replacing the two-dimensional matrices by 2f-dimensional matrices 
corresponding to the 2f-dimensional manifold of stationary states: 

$$q_k = [q_k(n_1 \ldots n_f, m_1 \ldots m_f)], \quad p_k = [p_k(n_1 \ldots n_f, m_1 \ldots m_f)]. \qquad (12.66)$$

The equations of motion corresponding to (12.39) can be taken over in the form 

$$\dot{q}_k = \frac{\partial H}{\partial p_k}, \quad \dot{p}_k = -\frac{\partial H}{\partial q_k}. \qquad (12.67)$$

In addition, the commutation relations have to be extended as follows: 

$$p_k q_l - q_l p_k = \frac{h}{2\pi i}\,\delta_{kl}, \quad p_k p_l - p_l p_k = 0, \quad q_k q_l - q_l q_k = 0. \qquad (12.68)$$

They demonstrated how all the results described in Sect. 12.3.1 can be extended naturally 
to the 2 f-dimensional case. Also in Chap. 2, they showed how the scheme can deal with 
degenerate as well as non-degenerate quantum systems. 

Chapter 3 was written by Born. He had the advantage over Heisenberg and Jordan 
in having in-depth knowledge of the mathematical advances carried out by Hilbert and 
Hellinger which turned out to be exactly the tools needed to complete the formulation of 
matrix mechanics. As Born expressed it, 


‘But behind the formalism of this perturbation theory there lurks a very simple, purely 
algebraic connection.... Apart from the deeper insight into the mathematical structure 
of the theory, we thereby gain the advantage of being able to use the methods and results 
developed earlier in mathematics.’ 


Born’s remark refers to his realisation that the transformation of matrices can be regarded 
as equivalent to a system of linear transformations for bilinear forms. The reason is that, to 
every matrix a = [a(nm)], there corresponds a bilinear form defined by 


$$A(x, y) = \sum_{nm} a(nm)\,x_n y_m \qquad (12.69)$$

of the two series of variables x₁, x₂, ..., xₙ and y₁, y₂, ..., yₙ. One of the features of the 
matrix elements of a is that they should be Hermitian so that the squares of the amplitudes 
are real numbers. In the standard theory, Hermitian matrices are defined such that the 
transposed matrix $\tilde{a}$ is equal to its complex conjugate $a^*$, that is, $\tilde{a} = a^*$. In terms of 
bilinear forms, their Hermitian properties are defined by a(mn) = a*(nm). Then, if we 
write $y_n = x_n^*$, it follows that 

$$A(x, x^*) = \sum_{nm} a(nm)\,x_n x_m^* \qquad (12.70)$$

is real. 

The importance of these considerations is that the theory of bilinear forms had been 
pioneered by Hilbert and his studies were brought together in his book Grundzüge einer 
allgemeinen Theorie der linearen Integralgleichungen (Hilbert, 1912). In particular, he 
showed that, for a finite number of variables, it is always possible to carry out an orthog- 
onal transformation of a bilinear form to a sum of squares, the procedure known as the 
transformation to principal axes, specifically, 


$$A(x, x^*) = \sum_n W_n\,y_n y_n^*. \qquad (12.71)$$

Rewriting this expression in terms of matrices, there exists an orthogonal matrix v such 
that 

$$v\tilde{v}^* = I \quad \text{and} \quad va\tilde{v}^* = vav^{-1} = W, \qquad (12.72)$$

where $W = (W_n\delta_{nm})$ is a diagonal matrix. Born knew that the problem of the diagonalisation 
of Hermitian forms was equivalent to an ‘eigenvalue problem’ and he called the solutions 
for W the eigenvalues of the set of linear equations 

$$Wx_k - \sum_l H(kl)\,x_l = 0, \qquad (12.73)$$

a typical eigenvalue equation. As van der Waerden remarks, for Born the eigenvalues were 
mathematical tools for determining the energy levels of the atomic system. Only with 
the somewhat different approach of Schrödinger was it appreciated that the eigenvectors 
(x₁, x₂, ...) determine the properties of the stationary states of the atom. 
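
The chain of statements in (12.70) to (12.73) can be followed on a small example. This is a sketch only: the Hermitian matrix `a` and the vector `x` are random illustrative data, and the transformation to principal axes is taken from NumPy's `eigh` routine rather than from Hilbert's theory.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

# A random Hermitian matrix, a(mn) = a*(nm); purely illustrative data.
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
a = B + B.conj().T

x = rng.normal(size=n) + 1j * rng.normal(size=n)

# Hermitian form (12.70): A = sum_{nm} a(nm) x_n x_m^*
A = np.einsum("nm,n,m->", a, x, x.conj())

# Principal axes: a = U diag(W) U^dagger with real eigenvalues W.
W, U = np.linalg.eigh(a)

# New variables y, linear in x, chosen so that A = sum_n W_n y_n y_n^*  (12.71)
y = U.T @ x
A_principal = np.sum(W * np.abs(y) ** 2)

# Eigenvalue equation (12.73), column by column: a u_n - W_n u_n = 0
residual = a @ U - U * W

print(A.imag, A.real, A_principal)
```

The imaginary part of `A` vanishes to rounding error, and the same real number is recovered as a weighted sum of squares in the new variables, which is the content of the transformation to principal axes.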

Born and his colleagues also appreciated that, in addition to enabling the energy levels of 
the atomic systems to be determined, the formalism of Hilbert and Hellinger also enabled 
continuous spectra to be treated. Continuous spectra are solutions of the same equations 
of motion, but the orthogonality relations now need to be written in terms of differential 
spectra rather than discrete spectral lines. As they wrote in their paper, 


‘The simultaneous appearance of both continuous and line spectra as solutions of the 
same equations of motion and the same commutation relations seemed to us to represent a 
particularly significant feature of the new theory. . . . there nevertheless are characteristic 
distinctions, both mathematically and physically, between continuous and discrete spectra, 
corresponding to the differences between Fourier series and Fourier integrals in classical 
theory.’ 


They illustrate the differences by considering the classical analogues of periodic and aperiodic 
motions. In the case of multiply periodic systems such as those considered in the 
old quantum theory, a Fourier series a(ν) can be associated with oscillations of the form 
exp(2πiνt). In contrast, for aperiodic motions, it is necessary to work in terms of Fourier 
integrals where y(ν) dν takes the place of a(ν). In the quantum mechanical case, the quantities 
q(kl) are replaced by the differential quantities q(k, W) dW or q(W, W′) dW dW′, 
depending on whether one or both indices lie in the continuous region, where we denote 
states in the continuum by the Ws. Thus, the quantities to be subject to the orthogonality 
conditions are differential energies rather than total energies. 

It turned out that Ernst Hellinger, an old friend of Born’s, had completed his doctoral 
dissertation at Göttingen in 1907 under Hilbert’s supervision on precisely the topic of 
bilinear forms which could not be represented by discrete quantities q(kl) (Hellinger, 1909). 
In cases in which $\sum_{mn} H(mn)\,x_m x_n^*$ cannot be converted into the expression $\sum_n W_n\,y_n y_n^*$ by 
an orthogonal transformation, it was assumed that a representation including a continuum 
spectrum exists: 

$$\sum_{mn} H(mn)\,x_m x_n^* = \sum_n W_n\,y_n y_n^* + \int W(\varphi)\,y(\varphi)\,y^*(\varphi)\,d\varphi, \qquad (12.74)$$

in which the variables $x_n$ are connected to the variables $y_n$ and y(φ) by an orthogonal 
transformation. Hellinger had shown that the corresponding orthogonality conditions for 
any two intervals Δ₁ and Δ₂ of the continuous spectrum can be written 

$$\sum_k \int_{\Delta_1} x_k(W')\,dW' \int_{\Delta_2} x_k^*(W'')\,dW'' = \int_{\Delta_{12}} d\varphi(W) = \varphi(W^{(2)}) - \varphi(W^{(1)}), \qquad (12.75)$$


where Δ₁₂ is the interval common to both Δ₁ and Δ₂ and W⁽¹⁾ and W⁽²⁾ are the endpoints 
of Δ₁₂. If there is no overlap between Δ₁ and Δ₂, the right-hand side of (12.75) is zero. 
The orthogonal matrix for transformation to principal axes now takes the form 

$$S = [x_{kn},\ x_k(W)\,dW] \qquad (12.76)$$

and is represented schematically in Fig. 12.1a. 

Fig. 12.1 (a) A schematic representation of the ‘orthogonal matrix’ S showing the discrete values $x_{kn}$ and the continuous 
distribution $x_k(W)\,dW$. (b) Matrix representing the discrete and continuum values of the momentum and coordinate 
matrices p and q according to matrix mechanics (Born et al., 1926). 

These matrices can then be used to reduce the momentum and coordinate matrices to 
diagonal form by the orthogonal transformations 

$$p = Sp^0S^{-1}, \quad q = Sq^0S^{-1}, \qquad (12.77)$$

resulting in four types of elements for p, which are illustrated schematically in the matrix 
shown in Fig. 12.1b: 

$$\begin{aligned}
p(mn) &= \sum_{kl} x_{km}^*\,p^0(kl)\,x_{ln}, \\
p(m, W)\,dW &= \sum_{kl} x_{km}^*\,p^0(kl)\,x_l(W)\,dW, \\
p(W, n)\,dW &= \sum_{kl} x_k^*(W)\,dW\,p^0(kl)\,x_{ln}, \\
p(W', W'')\,dW'\,dW'' &= \sum_{kl} x_k^*(W')\,dW'\,p^0(kl)\,x_l(W'')\,dW''.
\end{aligned} \qquad (12.78)$$



These correspond to four types of transition: top left, from ellipse to ellipse; top right, from 
ellipse to hyperbola; bottom left, from hyperbola to ellipse; bottom right, from hyperbola 
to hyperbola. In the language of atomic physics, these correspond to bound-bound, 
bound-free, free-bound and free-free transitions. 


12.3.3 The quantisation of angular momentum 


The fourth and final chapter of the Three-Man Paper concerned physical applications of 
the theory. The first part of the chapter is devoted to the laws of conservation of momentum 
and angular momentum and the determination of selection rules for transitions between 
stationary states. 
First of all, the matrix expressions for the components $M_x$, $M_y$, $M_z$ of the total angular 
momentum of the system M were written down by analogy with their classical counterparts: 

$$M_x = \sum_{k=1}^{f/3} (p_{ky} q_{kz} - q_{ky} p_{kz}), \qquad (12.79)$$

$$M_y = \sum_{k=1}^{f/3} (p_{kz} q_{kx} - q_{kz} p_{kx}), \qquad (12.80)$$

$$M_z = \sum_{k=1}^{f/3} (p_{kx} q_{ky} - q_{kx} p_{ky}). \qquad (12.81)$$


Just as in the classical case, in the absence of external torques, these components are 
conserved, but this now follows from the commutation properties of the ps and qs. Similarly, 
total linear momentum and the components of the linear momentum are conserved in the 
absence of forces: 

$$p_x = \sum_{k=1}^{f/3} p_{kx} = \text{constant}, \quad p_y = \sum_{k=1}^{f/3} p_{ky} = \text{constant}, \quad \ldots \qquad (12.82)$$

From the commutation relations (12.68), they immediately derived the fundamental 
quantum mechanical relation 

$$M_x M_y - M_y M_x = \frac{h}{2\pi i}\,M_z. \qquad (12.83)$$

The square of the total angular momentum, M², is a diagonal matrix and commutes with the 
z-component of the angular momentum: 

$$M^2 M_z - M_z M^2 = 0. \qquad (12.84)$$



The elements of the diagonal total angular momentum matrix M² were shown to be 
j(j + 1)(h/2π)², where j is an integer or half-integer. Finally, the selection rules for transitions 
were shown to be 

$$m \to m + 1 \quad\text{or}\quad m \quad\text{or}\quad m - 1, \qquad (12.85)$$

$$j \to j + 1 \quad\text{or}\quad j \quad\text{or}\quad j - 1, \qquad (12.86)$$

where the js and ms are integers or half-integers and |m| ≤ j. These are precisely the rules 
which had been derived empirically by Landé and his colleagues. The formalism also 
enabled the intensities and polarisations of the lines to be determined. 
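
The relations (12.83) and (12.84) and the values j(j + 1)(h/2π)² can be checked on small matrices. The sketch below makes two assumptions that are not in the text: it builds the standard textbook matrices J for a given j in units with h/2π = 1 (the helper `angular_momentum` is hypothetical), and it sets M = −J, which is one way of reproducing the sign convention of (12.79) to (12.83).

```python
import numpy as np


def angular_momentum(jj):
    """Standard matrices (J_x, J_y, J_z) for quantum number jj, with h/(2 pi) = 1."""
    m = np.arange(jj, -jj - 1, -1)                  # m = jj, jj-1, ..., -jj
    # Raising operator: <m+1| J_+ |m> = sqrt(jj(jj+1) - m(m+1))
    jp = np.diag(np.sqrt(jj * (jj + 1) - m[1:] * (m[1:] + 1)), k=1)
    jm = jp.conj().T
    jx = (jp + jm) / 2
    jy = (jp - jm) / (2 * 1j)
    jz = np.diag(m).astype(complex)
    return jx, jy, jz


results = {}
for jj in (0.5, 1.0, 1.5):
    jx, jy, jz = angular_momentum(jj)
    mx, my, mz = -jx, -jy, -jz                      # assumed sign convention, M = -J

    commutator_ok = np.allclose(mx @ my - my @ mx, (1 / 1j) * mz)      # (12.83)
    m2 = mx @ mx + my @ my + mz @ mz
    m2_ok = np.allclose(m2, jj * (jj + 1) * np.eye(len(mz)))           # j(j+1)
    commutes_ok = np.allclose(m2 @ mz, mz @ m2)                        # (12.84)
    results[jj] = (commutator_ok, m2_ok, commutes_ok)

print(results)
```

Both integer and half-integer values of j pass the same checks, in line with the statement that the matrix formalism allows either.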

After a brief discussion of the Zeeman effect, the paper concludes with an analysis by 
Jordan of the statistics of the wave fields emitted by an ensemble of harmonic oscillators. 
Using the new formalism, he was able to recover Einstein’s expression (3.41) for the 
fluctuations in black-body radiation 

$$\overline{\Delta^2} = h\nu\,\overline{E} + \frac{\overline{E}^2}{z_\nu V}, \qquad (12.87)$$

where $z_\nu\,d\nu$ is the number of eigenvibrations, or normal modes, in the frequency interval ν 
to ν + dν. 


12.3.4 Reflections 


The Three-Man Paper was a quite remarkable achievement. The three authors were entering 
entirely new areas of theoretical physics using new mathematical tools. The extraordinary 
feature of their work is that they successfully worked out all the essential features of 
the full theory of non-relativistic quantum mechanics. But this was achieved at a price — 
the mathematics was known to relatively few physicists and the physical content of the 
theory was not yet fully understood. The authors were well aware of this. In his introduction, 
Heisenberg made every effort to endow the theory with physical meaning, but there was still 
some way to go. Fortunately, they were in continuous correspondence with their theoretical 
colleagues, in particular, with Pauli and Weyl. Once the clues were provided, they rapidly 
produced independent analyses which resulted in the same set of commutation relations. 
Still, the community of physicists at large was wary of the complexity of the new theory 
until Pauli used the new matrix methods to derive the energy levels and selection rules for 
the hydrogen atom. 


12.4 Pauli’s theory of the hydrogen atom 


Pauli was sceptical about matrix mechanics, still railing against what he perceived to be 
Born’s excessively formal mathematical approach to the problems of quantum physics. He 
felt that the 43-year-old Born should leave the problems of quantum physics to the new 
generation of young physicists — in his opinion, the new physics was Knabenphysik, young 
man’s physics. Even by October 1925, he was still opposed to the Göttingen approach, as 
he wrote to Ralph Kronig: 


‘One must first seek to liberate Heisenberg’s mechanics from Göttingen’s deluge of formal 
learning and better expose its physical essence.’ 


Heisenberg was not to take this lying down. Having seen Pauli’s letter to Kronig, he 
responded: 


‘With respect to both of your last letters I must preach you a sermon, and beg your 
pardon for proceeding in Bavarian: It really is a pigsty that you cannot stop indulging 
in a slanging match. Your eternal reviling of Copenhagen and Göttingen is a shrieking 
scandal.... When you reproach us that we are such big donkeys that we have never 
produced anything new in physics, it may well be true. But then, you are also an equally 
big jackass because you have not accomplished it either...’ 


This provoked Pauli into action, showing the Göttingen physicists that he could beat them 
at their own game. Born, Heisenberg and Jordan were well aware that the ‘crown jewels’ 
would be the demonstration that the new theory of matrix mechanics could account for 
the Balmer spectrum of the hydrogen atom. Bohr’s theory of the hydrogen atom was now 
regarded, even by Bohr himself, as an ‘accident’ — according to the new perspective, it was 
meaningless to talk about electron orbits since they did not correspond to observables. The 
Three-Man Paper, however much it struck right at the heart of quantum mechanics, did not 
produce the answer, much to their frustration. 

The problem they encountered was that the matrix formalism could not be applied 
directly to the conjugate of the angular momentum matrix M. Classically, the conjugate of 
the action variable J is the angle variable φ, but there was no matrix corresponding to φ. 
In the second section of his paper, Pauli explained the problem: 


‘... we must first ... develop the requisite rules for simultaneously operating with matrices 
x, y and z of the Cartesian coordinates of the electron ..., the matrix r of the magnitude 
of the radius vector, and their time derivatives. The present version of the laws of the 
new quantum mechanics requires that we avoid the introduction of a polar angle φ. Since 
this is not confined within finite limits, it cannot, namely, be formally represented as a 
matrix in the same way as the above-mentioned coordinates, which execute librations in 
classical mechanics.’ 


Pauli quickly mastered the techniques of matrix mechanics and then discovered an 
ingenious route for circumventing the problems associated with the polar angle φ. He had 
just completed a major survey of quantum physics for the Handbuch der Physik and in that 
review he described the approach to the analysis of orbits in an inverse-square law field by 
his colleague at Hamburg, Wilhelm Lenz (Pauli, 1926). Lenz avoided using the angle φ by 
introducing another constant of the motion, the vector A, which was defined by 

$$\mathbf{A} = \frac{4\pi\epsilon_0}{Ze^2m_0}\,[\mathbf{M} \times \mathbf{p}] + \frac{\mathbf{r}}{r}, \qquad (12.88)$$



where m₀ is the mass of the electron, p its linear momentum, M its angular momentum 
vector and r the vector from the focus to a point on the ellipse. The significance of A can 
be appreciated from the following calculation. Taking the scalar product of A with r, 

$$\begin{aligned}
\mathbf{A}\cdot\mathbf{r} &= \frac{4\pi\epsilon_0}{Ze^2m_0}\,[\mathbf{M}\times\mathbf{p}]\cdot\mathbf{r} + r, \\
|\mathbf{A}||\mathbf{r}|\cos\phi &= \frac{4\pi\epsilon_0}{Ze^2m_0}\,[\mathbf{p}\times\mathbf{r}]\cdot\mathbf{M} + |\mathbf{r}|, \\
\frac{4\pi\epsilon_0}{Ze^2m_0}\,\frac{|\mathbf{M}|^2}{|\mathbf{r}|} &= 1 - |\mathbf{A}|\cos\phi,
\end{aligned} \qquad (12.89)$$

since M = r × p. This is exactly the expression for the elliptical orbit of an electron in an 
inverse-square electrostatic field written in pedal, or (r, φ), form as in (5.1), 

$$\frac{\lambda}{r} = 1 + \varepsilon\cos\phi \quad \text{where} \quad \lambda = \frac{4\pi\epsilon_0|\mathbf{M}|^2}{Ze^2m_0}, \qquad (12.90)$$

and ε is the eccentricity of the ellipse. This result provides a physical interpretation of the 
meaning of the vector A. Comparison of (12.89) and (12.90) shows that the magnitude of 
A is the eccentricity ε of the ellipse and that the vector points from one focus of the ellipse 
along the major axis towards the other focus, as can be appreciated from the geometry of 
Fig. 5.1. 

Next, (12.90) can be written in terms of the energy W of the orbit. According to classical 
dynamics, the total energy of the elliptical orbit under the influence of the inverse-square 
law of electrostatics is 

$$W = T + U = \frac{U}{2} = -\frac{Ze^2}{8\pi\epsilon_0 a} = -\frac{Ze^2(1 - \varepsilon^2)}{8\pi\epsilon_0\lambda}. \qquad (12.91)$$

T and U are the kinetic and electrostatic potential energies respectively and a = λ/(1 − ε²) 
is the length of the semi-major axis of the ellipse. This result was derived in Sect. 5.2 as 
expression (5.22). Substituting the value of λ from (12.90), 

$$(1 - \varepsilon^2) = -W|\mathbf{M}|^2\,\frac{32\pi^2\epsilon_0^2}{Z^2e^4m_0}, \qquad (12.92)$$

that is, 

$$1 - |\mathbf{A}|^2 = -W|\mathbf{M}|^2\,\frac{32\pi^2\epsilon_0^2}{Z^2e^4m_0}. \qquad (12.93)$$


The importance of this result for Pauli was the fact that the orbits of the ellipses in the 
old quantum theory were confined to a plane and were characterised by two variables, the 
action-angle variables J and φ. Now, he had two independent vectors which are constants 
of the motion, M and A, which do not suffer from the problems of the angle variable. In 
addition, (12.93) depended only upon the values of the two constants of the motion and the 
energy of the orbit. Pauli realised that the energies of the stationary states could be found 
by converting this classical formalism into the new matrix mechanics and then the energy 
levels of the hydrogen atom could be determined according to the prescription of Born, 
Heisenberg and Jordan. 
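
The classical relation (12.93) between A, M and W is easy to check numerically before any matrices are introduced. The following sketch assumes units in which m₀ = 1 and Ze²/4πε₀ = 1 (written `k` below), so that the factor 32π²ε₀²/(Z²e⁴m₀) becomes 2/(m₀k²); the position and momentum are arbitrary illustrative values for a bound orbit.

```python
import numpy as np

m0 = 1.0        # electron mass in the chosen units (assumption)
k = 1.0         # k = Z e^2 / (4 pi eps0) in the chosen units (assumption)

# An arbitrary point on a bound orbit (illustrative values).
r = np.array([1.0, 0.2, 0.0])
p = np.array([-0.1, 0.9, 0.0])
r_abs = np.linalg.norm(r)

M = np.cross(r, p)                               # angular momentum, M = r x p
A = np.cross(M, p) / (m0 * k) + r / r_abs        # Lenz vector as in (12.88)
W = p @ p / (2 * m0) - k / r_abs                 # total energy, negative if bound

# (12.93): 1 - |A|^2 = -W |M|^2 * 2 / (m0 k^2) in these units
lhs = 1 - A @ A
rhs = -2 * W * (M @ M) / (m0 * k**2)

# Orbit equation (12.89): (M^2 / (m0 k)) / r = 1 - A.r / r
orbit_lhs = (M @ M) / (m0 * k) / r_abs
orbit_rhs = 1 - (A @ r) / r_abs

eccentricity = np.linalg.norm(A)
print(W, eccentricity, lhs, rhs)
```

Because the identity holds at every point of the orbit, any other bound initial condition could be substituted for `r` and `p`; the magnitude of `A` then gives the eccentricity directly.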

In an extraordinary tour de force, he now set about converting the vectors M and A into 
matrix form, using his own very considerable ability at ‘formal mathematics’ as well as 
ingenuity and insight in making the problem of the hydrogen atom soluble. An outline of 
his calculation is as follows. 

First of all, the angular momentum matrix was defined as M = m₀(r × v) and was 
shown to commute with the Hamiltonian matrix of the one-electron atom. Hence, the 
angular momentum was a constant of the motion. Next, from (12.88) he defined the matrix 
corresponding to A as 

$$\mathbf{A} = \frac{4\pi\epsilon_0}{Ze^2m_0}\,\frac{1}{2}\,[\mathbf{M}\times\mathbf{p} - \mathbf{p}\times\mathbf{M}] + \frac{\mathbf{r}}{r}. \qquad (12.94)$$



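Then, he established the commutation relations for M and A 

$$(\mathbf{M}\times\mathbf{M}) = -\frac{h}{2\pi i}\,\mathbf{M}, \qquad (12.95)$$

$$(\mathbf{A}\times\mathbf{M}) = -\frac{h}{2\pi i}\,\mathbf{A}, \qquad (12.96)$$

$$(\mathbf{A}\times\mathbf{A}) = \frac{h}{2\pi i}\,\frac{16\pi^2\epsilon_0^2}{Z^2e^4m_0}\,2H\,\mathbf{M}, \qquad (12.97)$$

where H is the Hamiltonian of the system. The matrix expression corresponding to (12.93) 
became 

$$1 - \mathbf{A}^2 = -\frac{16\pi^2\epsilon_0^2}{Z^2e^4m_0}\,2H\left(\mathbf{M}^2 + \frac{h^2}{4\pi^2}\,I\right). \qquad (12.98)$$
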


The objective was now to find the eigenvalues associated with the solutions of (12.98) but 
Pauli was well aware of the fact that they would be degenerate. The same problem occurred 
in the old quantum theory in which the energies of ellipses with the same semi-major axes 
were all the same. There was, however, a standard procedure for relieving the degeneracy, 
which had already been used by Born and Pauli in a paper of 1922, and that was to perturb 
the Hamiltonian by the addition of a term of the form λH₁ which would split the degenerate 
energy levels. To achieve the complete separation of the levels, Pauli included two types of 
perturbation in H₁, a non-Coulomb radial field and a magnetic field in the z-direction. As 
the small parameter λ tends to zero, the solution for the degenerate case is recovered. 

Pauli knew about the rules of quantisation of angular momentum described in Sect. 12.3.3 
and used these to obtain the same result as Born, Heisenberg and Jordan (1926) that the 
eigenvalues of the matrix M² were j(j + 1)(h/2π)². The determination of the elements 
of the matrix A remained a challenge which Pauli solved using the rules for the relative 
intensities of the Zeeman components which had previously been derived by Hönl (1925) 
and Goudsmit and Kronig (1925). Finally, Pauli obtained the result that the absolute values 
of the energy levels of the hydrogen atom were 

$$W_n = H(j, m; j, m) = -\frac{Z^2e^4m_0}{8\epsilon_0^2h^2(j_{\max} + 1)^2}. \qquad (12.99)$$

Setting $j_{\max} + 1 = n$, Bohr’s formula for the energy levels of the hydrogen atom (4.25) was 
recovered: 

$$W_n = -\frac{Z^2e^4m_0}{8\epsilon_0^2h^2n^2}. \qquad (12.100)$$

In coming to this spectacular result, Pauli obtained other key features of the hydrogen atom 
as well. He showed that the values of j and m had to be integers and that the maximum 
value of the angular momentum quantum number is $j_{\max} = n - 1$, where n = 1 is the ground 
state. Thus, unlike the old quantum theory, the lowest energy state of the hydrogen atom 
has zero angular momentum. This is a pure quantum mechanical result, quite unlike the 
lowest energy state of the Bohr atom. This had the effect of getting rid of the empirical 
restrictions which had to be placed upon the allowed orbits in the old quantum theory, for 
example, the exclusion of linear orbits in which the electron passes through the nucleus 
(Sect. 5.2). It will be recognised that Pauli had derived many of the essential features of the 
quantum mechanical model of the hydrogen atom. 
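
To put numbers to (12.100), the energy levels can be evaluated directly. This is an illustrative sketch: the physical constants are CODATA 2018 values entered by hand, and the degeneracy count (all integer j from 0 to n − 1, each with 2j + 1 values of m, electron spin ignored) is an added illustration based on the quantum numbers described above rather than a formula quoted from Pauli's paper.

```python
# CODATA 2018 values in SI units (entered by hand; treat as assumptions).
e = 1.602176634e-19        # elementary charge, C
h = 6.62607015e-34         # Planck's constant, J s
m0 = 9.1093837015e-31      # electron mass, kg
eps0 = 8.8541878128e-12    # permittivity of free space, F/m


def energy_level(n, Z=1):
    """Energy W_n of (12.100) in joules."""
    return -(Z**2) * e**4 * m0 / (8 * eps0**2 * h**2 * n**2)


def degeneracy(n):
    """Number of (j, m) pairs with j = 0, ..., n - 1 and m = -j, ..., j."""
    return sum(2 * j + 1 for j in range(n))


for n in (1, 2, 3):
    print(n, energy_level(n) / e, degeneracy(n))   # energy in electron volts
```

The ground state n = 1 comes out close to −13.6 eV with a single state j = 0, and the count of states grows as n², which is the degeneracy that the perturbation λH₁ was introduced to lift.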

In addition, the same new quantum conditions resolved the problem of the motion of the 
electron in the hydrogen atom under the influence of crossed electric and magnetic fields. 
According to the old quantum theory, the perturbation of the energy of the state $W_n$ in the 
presence of crossed electric and magnetic fields was 

$$\Delta W = \left(\frac{n}{2} - n_1\right)\omega_1 h + \left(\frac{n}{2} - n_2\right)\omega_2 h, \qquad (12.101)$$

where $\omega_1 = \nu_L + \nu_S$, $\omega_2 = |\nu_L - \nu_S|$, $\nu_L$ is the Larmor frequency and $\nu_S$ is the precession 
frequency associated with the Stark effect. Pauli showed that according to the new quantum 
theory, the quantum number n should be replaced by $j_{\max} = n - 1$, again eliminating linear 
orbits in which the electron passes through the nucleus. In the old quantum theory, these 
orbits had to be excluded arbitrarily since they violated the adiabatic principle. 


12.5 The triumph of matrix mechanics and its incompleteness 


Pauli’s achievement in solving the problem of the hydrogen atom according to matrix me- 
chanics was universally applauded as a triumph for the new quantum mechanics. Pauli had 
carried out the calculations described in Sect. 12.4 in only three weeks and communicated 
the results to Heisenberg, who wrote on 3 November 1925, 


‘I need not tell you how delighted I am about the new theory of hydrogen and how 
pleasantly surprised I am about the speed with which you produced this theory.’ 


Bohr learned about Pauli’s new results from Kramers soon after this date and immediately 
wrote to Pauli: 


‘To my great joy I heard from Kramers that you have succeeded in deriving the Balmer 
formula.’ 


Pauli responded by sending Bohr and his colleagues details of his calculations which 
impressed them greatly. Bohr wrote to Pauli on 5 December 1925: 


‘Kramers, Kronig and I, who have just gone once again with the greatest pleasure through 
your beautiful calculation of the hydrogen spectrum, send you many friendly greetings 
from Tisvilde.’ 


These remarkable calculations convinced most physicists that the new matrix mechanics 
had the power to account for quantum phenomena in a self-consistent fashion. 

The new quantum mechanics was however unfamiliar to the physicist-in-the-street and 
there were relatively few problems which could be successfully addressed by the formalism 
as it stood. The helium atom was still too hard and the spin of the electron still had to be 
incorporated into the scheme of quantum mechanics. 

One other notable success was the matrix mechanics of diatomic molecules which was 
carried out by Lucie Mensing, a graduate student of Wilhelm Lenz in Hamburg. The two 
papers of Born, Heisenberg and Jordan (1925b; 1926) had already considered the cases of 
the nonlinear oscillator and the rotator in quantum mechanics and so she was able to use 
much of the material already in these papers to address the quantisation of the vibrational 
and rotational stationary states of diatomic molecules (Mensing, 1926). She found that the 
energy states of the diatomic molecule were given by the expression 


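$$W_{n,j} = U_0 + \frac{h^2}{8\pi^2 I}\,j(j+1) + h\left(n + \tfrac{1}{2}\right)\left[\nu_0 + \beta\,j(j+1)\right] + \alpha h^2\left[n(n+1) + \tfrac{1}{2}\right] + \cdots \tag{12.102}$$
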
This expression has a similar form to the classical expression for the vibrational and
rotational frequencies of a diatomic molecule, but with some important differences: 


The second term on the right-hand side represents the rotational states of the molecule
with the classical $j^2$ being replaced by the quantum mechanical $j(j + 1)$.

The third term corresponds to the classical term $hn(\nu_0 + \beta j^2)$ and describes the
harmonic oscillations of the nuclei relative to their centre of mass and the mixed
rotational-vibrational modes.

The fourth term describes the anharmonicity of the oscillations and replaces the classical
term $\alpha h^2 n^2$.

Particularly important was the inclusion of the factor $(n + \tfrac{1}{2})$ in the third term. The term
one-half corresponds to the zero point energy of the oscillator and its presence resolved
the discrepancy between the experimental values of the vibrational modes of the molecule
and the classical theory. To account for this discrepancy half-integral quantum numbers
had been introduced, but Mensing’s calculation now showed that this was unnecessary.
The only allowed transitions were associated with $\Delta j = \pm 1$ for the rotational terms and
$\Delta n = 0, \pm 1, \pm 2, \ldots$ for the vibrational terms.

















The successes of matrix mechanics were undeniable, but there was still a long way to
go before the full richness of quantum phenomena could be encompassed by a fully
self-consistent theory. Matrix mechanics was only the first step along the route to the modern
theory of quantum mechanics.


http://dx.doi.org/10.1017/CBO9781139062060.013 
Cambridge Books Online © Cambridge University Press, 2014 





Dirac’s quantum mechanics 





Göttingen and Copenhagen were the undoubted capitals of the new discipline of quantum 
mechanics. The expertise in experimental and mathematical physics and in pure math- 
ematics made Göttingen the epicentre of the revolution which was taking place in the 
mathematical physics of quanta. Whilst this was to remain the case for the next few years, 
other actors soon appeared on the scene who were to contribute to Born’s ‘tangle of inter- 
connected alleys’. What was truly remarkable was how quickly the different approaches to 
the problems of quantum theory were developed and the rapid assimilation of all of them 
into a coherent and self-consistent theory of quantum mechanics. Whilst the theory itself 
was completed relatively quickly, the understanding of its physical content was to take 
many more years. 

The new players on the scene included Paul Dirac at Cambridge, Erwin Schrödinger in
Vienna and Norbert Wiener at the Massachusetts Institute of Technology. Each of them
brought quite new approaches to the development of quantum theory. Their innovations
were to supersede the matrix mechanics of Born, Heisenberg and Jordan, but there can
be no doubt that the success of that theory indicated clearly the route ahead. These
innovations were, however, to involve the introduction of new mathematical techniques into
the description of quantum phenomena.


13.1 Dirac’s approach to quantum mechanics 




Paul Dirac was trained as an electrical engineer at Bristol University, but he had a very 
strong mathematical bent. He was a solitary character who was notoriously quiet and self- 
effacing. He simply worked things out on his own. He was interested in problems which 
could be treated on a strict mathematical basis and looked for beauty in the mathematics 
needed to describe nature. On the one hand, his training as an engineer had taught him to be 
somewhat pragmatic about the mathematics necessary to address any particular problem. 
As he wrote in his memoirs of 1967, 


‘I think if I had not had this engineering training, I should not have had any success with 
the kind of work I did later on, because it was really necessary to get away from the point 
of view that one should only deal with exact equations, and that one should deal only with 
results which could be deduced logically from known exact laws which one accepted, 
in which one had implicit faith. Engineers were concerned only with getting equations 
which were useful for describing nature. ... 
And that led me of course to the view that this outlook was really the best outlook to
have. We wanted a description of Nature. We wanted to find the equations which would
describe Nature, and the best we could hope for was usually approximate equations, and
we would have to reconcile ourselves to an absence of strict logic...’ (Dirac, 1977)


On the other hand, the theory had to be ‘beautiful’. In the same memoir, he writes about 
his meeting with Schrödinger: 


‘...of all the physicists that I met, I think Schrödinger was the one that I felt to be
most closely similar to myself. I found myself getting into agreement with Schrödinger
more readily than with anyone else. I believe the reason for this was that Schrödinger
and I both had a very strong appreciation of mathematical beauty, and this appreciation 
of mathematical beauty dominated all our work. It was a sort of faith with us that any 
equations which described the fundamental laws of Nature must have great mathematical 
beauty in them. It was like a religion with us. It was a very profitable religion to hold, and 
can be considered as the basis for much of our success.’ (Dirac, 1977) 


13.2 Dirac and The fundamental equations of quantum mechanics (1925)


Picking up the story from the end of Chap. 11, Heisenberg left the decision about whether 
or not to publish his revolutionary paper of 1925 to Born, since he had to leave for 
Cambridge where he was to deliver a lecture to the Kapitsa Club on 28 July 1925. These 
gatherings had been instituted by Piotr Kapitsa and at them the latest developments in 
physics were discussed by the staff and research students of the Cavendish Laboratory 
and other cognate departments. Dirac was a member of the club. The title of Heisenberg’s 
talk was Termzoologie und Zeemanbotanik and was principally concerned with his recent 
researches into the anomalous Zeeman effect. Dirac had only vague memories of the 
meeting, but some discussion of Heisenberg’s most recent work on the fundamentals of 
quantum mechanics took place either during the lecture or in the informal discussions 
afterwards. Ralph Fowler, who was Dirac’s supervisor, was impressed by what Heisenberg 
described about this new work and asked Heisenberg to send copies of the proofs of his 
paper when they became available. This duly happened a few weeks later and Fowler sent 
the proofs to Dirac, who had returned home to Bristol for part of the vacation, seeking his 
opinion on Heisenberg’s work. 
In Dirac’s words, 


‘I received it (the proofs) either at the end of August or the beginning of September... At 
first I was not very impressed by it. It seemed to me to be too complicated. I just did not 
see the main point of it, and in particular his determination of quantum conditions seemed 
to me too far fetched, so I just put it aside as being of no interest. However, a week or ten 
days later I returned to this paper of Heisenberg’s and studied it more closely. And then I 
suddenly realised that it did provide the key to the whole solution of the difficulties which 
we were concerned with. 
My previous work had been all concerned with studying individual states . . . Heisenberg
brought out the quite new idea that one had to consider quantities associated with two 
stationary states rather than one.’ 


But this scarcely does justice to the profound impact Heisenberg’s paper had upon Dirac. He 
quickly homed in on Heisenberg’s discovery that the variables involved at the quantum level 
do not commute. Dirac’s great achievement in his paper of 1925 was to recast Heisenberg’s 
insights into the language of Hamiltonian dynamics. As expressed by Jammer, 


‘In a few weeks’ time he achieved his objective and thus established one of the most 
profound and useful relations between quantum mechanics and the classical Hamilton–Jacobi
formulation of mechanics.’ (Jammer, 1989)


13.2.1 Reformulating non-commutability 


Like Born, Dirac appreciated that the deep insights in Heisenberg’s paper were the introduc- 
tion of quantities which depended upon two variables rather than one and the fact that these 
quantities were non-commuting, the feature which greatly disturbed Heisenberg. Born’s 
analysis led to the development of matrix mechanics which was the subject of Chap. 12. 
Quite independently and entirely on his own, Dirac discovered a rather different and, in the 
end, more powerful approach to the reformulation of quantum mechanics. An important 
aspect of Dirac’s approach was that he firmly believed that, whatever the correct formulation 
of the theory, it should be derivable from Hamiltonian mechanics, a subject in which he 
was already an expert. Like most theorists at the time, he had become fluent in the use of 
the techniques of Hamilton-Jacobi theory and of action-angle variables in the old quantum 
theory. As he remarked much later, 


‘At that time, I was expecting some kind of connection between the new mechanics and 
Hamiltonian dynamics (as Hamiltonian dynamics was used so much with Sommerfeld’s 
development of the Bohr theory) and it seemed to me that this connection should show 
up with large quantum numbers.’ 


Notice again Bohr’s correspondence principle at work. 

To appreciate what Dirac did, let us recall the Heisenberg prescription for taking the 
product of the quantities x and y, what Dirac refers to as the ‘Heisenberg product’. Following 
Jammer (1989), consider the one-dimensional classical case, that is, a system of one degree 
of freedom, in which x and y are functions of the action–angle variables J and w. As in
(11.13), for a multiply periodic system, x(n, t) and y(n, t) can be written as Fourier series: 


$$x = x(n, t) = x(J, w) = \sum_{\tau} x_\tau(J)\,\exp(2\pi i \tau w), \tag{13.1}$$
$$y = y(n, t) = y(J, w) = \sum_{\tau} y_\tau(J)\,\exp(2\pi i \tau w), \tag{13.2}$$


where, according to the old quantum theory, $J = nh$. In Heisenberg’s scheme, the quantities
$x_\tau(J)$ and $y_\tau(J)$ correspond to the quantum quantities $x(n, n - \tau)$ and $y(n, n - \tau)$. Dirac
was interested in the limit of large quantum numbers $n \gg \tau$, expecting that in this limit he
would find the classical analogue of $xy - yx$. The $(n, n - \tau - \sigma)$ component of $xy - yx$
is given by the difference of Heisenberg products





$$(xy - yx)_{n,\,n-\tau-\sigma} = \left[x(n, n-\tau)\,y(n-\tau, n-\tau-\sigma) - y(n, n-\sigma)\,x(n-\sigma, n-\tau-\sigma)\right] \times \exp[i\omega(n, n-\tau-\sigma)t]. \tag{13.3}$$
Concentrating on the terms in the first pair of square brackets on the right-hand side
of (13.3), this expression can be rewritten by subtracting and adding
$x(n - \sigma, n - \tau - \sigma)\,y(n - \tau, n - \tau - \sigma)$ to each of the terms so that





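$$\begin{aligned}
&x(n, n-\tau)\,y(n-\tau, n-\tau-\sigma) - y(n, n-\sigma)\,x(n-\sigma, n-\tau-\sigma) \\
&\quad = \left[x(n, n-\tau) - x(n-\sigma, n-\sigma-\tau)\right] y(n-\tau, n-\tau-\sigma) \\
&\qquad - \left[y(n, n-\sigma) - y(n-\tau, n-\tau-\sigma)\right] x(n-\sigma, n-\tau-\sigma).
\end{aligned} \tag{13.4}$$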


Now, working backwards, these Heisenberg products can be converted into their classical
equivalents for large quantum numbers $n \gg \tau + \sigma$ using the equivalences
$x(n, n - \tau) \to x_\tau(J)$ and $y(n, n - \sigma) \to y_\sigma(J)$. Then,











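$$\begin{aligned}
x(n, n-\tau) - x(n-\sigma, n-\sigma-\tau) &\to x_\tau(J) - x_\tau(J - h\sigma) = h\sigma\,\frac{\partial x_\tau}{\partial J}, \\
y(n, n-\sigma) - y(n-\tau, n-\tau-\sigma) &\to y_\sigma(J) - y_\sigma(J - h\tau) = h\tau\,\frac{\partial y_\sigma}{\partial J}.
\end{aligned} \tag{13.5}$$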
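Therefore, (13.4) becomes
$$h\left[\sigma\,\frac{\partial x_\tau(J)}{\partial J}\,y_\sigma(J) - \tau\,\frac{\partial y_\sigma(J)}{\partial J}\,x_\tau(J)\right]. \tag{13.6}$$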
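But, from the definitions of the $\sigma$ and $\tau$ components of x and y,
$$\begin{aligned}
\frac{\partial}{\partial w}\left[y_\sigma(J)\exp(2\pi i\sigma w)\right] &= 2\pi i\left[\sigma\,y_\sigma(J)\exp(2\pi i\sigma w)\right], \\
\frac{\partial}{\partial w}\left[x_\tau(J)\exp(2\pi i\tau w)\right] &= 2\pi i\left[\tau\,x_\tau(J)\exp(2\pi i\tau w)\right].
\end{aligned} \tag{13.7}$$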


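Combining (13.6) and (13.7), it follows that the nm component of the quantum mechanical
expression for $xy - yx$ corresponds to
$$\frac{h}{2\pi i}\sum_{\tau+\sigma=n-m}\left\{\frac{\partial}{\partial J}\left[x_\tau\exp(2\pi i\tau w)\right]\frac{\partial}{\partial w}\left[y_\sigma\exp(2\pi i\sigma w)\right] - \frac{\partial}{\partial J}\left[y_\sigma\exp(2\pi i\sigma w)\right]\frac{\partial}{\partial w}\left[x_\tau\exp(2\pi i\tau w)\right]\right\}, \tag{13.8}$$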
J ow 


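where we have dropped the arguments $(J)$ in $x_\tau$ and $y_\sigma$. The result is that the quantity
$xy - yx$ is identical to the quantity
$$\frac{h}{2\pi i}\left(\frac{\partial x}{\partial J}\frac{\partial y}{\partial w} - \frac{\partial y}{\partial J}\frac{\partial x}{\partial w}\right), \tag{13.9}$$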





which is, apart from the constant factor in front, the classical expression for the Poisson
bracket of the quantities x and y, which are functions of the canonical variables J and w.
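
The limiting step (13.5), in which a difference of quantum amplitudes is replaced by a derivative with respect to J, can be checked numerically. The following is only an illustrative sketch: it assumes units in which h = 1, so that J = n, and it takes as a stand-in amplitude $x_\tau(J) = \sqrt{J}$, the form appropriate to the $\tau = 1$ component of a harmonic oscillator. The function names are our own and do not appear in Dirac's paper.

```python
import math

H_PLANCK = 1.0  # assumption: units with h = 1, so that J = n * h = n


def amplitude(J):
    """Stand-in classical Fourier amplitude x_tau(J) = sqrt(J) (assumed form)."""
    return math.sqrt(J)


def quantum_difference(n, sigma):
    """x(n, n - tau) - x(n - sigma, n - sigma - tau), using x(n, n - tau) -> x_tau(n h)."""
    return amplitude(n * H_PLANCK) - amplitude((n - sigma) * H_PLANCK)


def classical_limit(n, sigma):
    """h * sigma * d x_tau / dJ evaluated at J = n h, the right-hand side of (13.5)."""
    J = n * H_PLANCK
    return H_PLANCK * sigma * 0.5 / math.sqrt(J)


def relative_error(n, sigma):
    """Fractional discrepancy between the quantum difference and its classical limit."""
    exact = quantum_difference(n, sigma)
    return abs(exact - classical_limit(n, sigma)) / abs(exact)


# The discrepancy shrinks as the quantum number n grows relative to sigma.
for n in (10, 1000, 100000):
    print(n, relative_error(n, sigma=2))
```

For this choice of amplitude the fractional discrepancy falls off roughly as $\sigma/4n$, which is the sense in which the replacement becomes exact for $n \gg \sigma$.
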
There is a delightful story about how Dirac realised the significance of Poisson brackets
during the exciting days of October 1925. Dirac had a strict rule about relaxing on Sunday 
afternoons by taking country walks around Cambridge. In his memoirs, he writes: 


‘It was during one of the Sunday Walks in October 1925 when I was thinking very 
much about this $uv - vu$, in spite of my intention to relax, that I thought about Poisson
brackets . . . I did not remember very well what a Poisson bracket was. I did not remember 
the precise formula for a Poisson bracket and only had some vague recollections. But there 
were exciting possibilities there and I thought I might be getting on to some big new idea. 

Of course, I could not [find out what a Poisson bracket was] right out in the country. I 
just had to hurry home and see what I could then find out about Poisson brackets. I looked 
through my notes and there was no reference there anywhere to Poisson brackets. The text 
books which I had at home were all too elementary to mention them. There was nothing I 
could do, because it was Sunday evening then and the libraries were all closed. I just had 
to wait impatiently through that night without knowing whether this idea was any good or 
not but still I think my confidence grew during the course of the night. The next morning, 
I hurried along to one of the libraries as soon as it was open and then I looked up Poisson 
brackets in Whittaker’s Analytic Dynamics [(1917)] and I found that they were just what 
I needed. They provided the perfect analogy with the commutator.’ 


13.2.2 Poisson brackets 


Like so many topics in analytic dynamics which have appeared in this story, Poisson 
introduced what are now known as Poisson brackets in the course of a study of an n-body 
problem in astronomy, namely, the motion of a planet about the Sun in the presence of the 
(n — 1) other planets. His problem was to solve the 3n Euler-Lagrange equations of motion 
(5.59) in the presence of the time-varying perturbation Q of the gravitational potential due 
to the other planets. The full significance of Poisson’s pioneering studies was only fully 
appreciated later in the century when Jacobi independently discovered the power of Poisson 
brackets. Jacobi referred to Poisson’s combination rule for Poisson brackets as ‘the most
profound discovery of Monsieur Poisson’ and ‘the most important theorem in dynamics’ 
(Jacobi, 1841). 
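The Poisson bracket of the functions g and h is defined to be
$$[g, h] = \sum_{i=1}^{n}\left(\frac{\partial g}{\partial p_i}\frac{\partial h}{\partial q_i} - \frac{\partial g}{\partial q_i}\frac{\partial h}{\partial p_i}\right), \tag{13.10}$$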
where g and h are functions of $p_i$ and $q_i$, the generalised, or canonical, momentum and
position coordinates of the n degrees of freedom of the system. In general, we can write
the time variation of, say g, as
$$\dot g = \sum_{i=1}^{n}\left(\frac{\partial g}{\partial q_i}\,\dot q_i + \frac{\partial g}{\partial p_i}\,\dot p_i\right). \tag{13.11}$$
Hamilton’s equations (5.69) enable us to replace $\dot q_i$ and $\dot p_i$ by the derivatives of the
Hamiltonian with respect to $p_i$ and $q_i$ and so


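$$\dot g = \sum_{i=1}^{n}\left(\frac{\partial g}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial g}{\partial p_i}\frac{\partial H}{\partial q_i}\right) = [H, g]. \tag{13.12}$$
Therefore, Hamilton’s equations can be rewritten
$$\dot q_i = [H, q_i]; \qquad \dot p_i = [H, p_i]. \tag{13.13}$$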


What Dirac appreciated was that the Poisson brackets have exactly the commutation
properties needed to describe the non-commutation features of the difference of Heisenberg
products $xy - yx$. Thus, in (13.10), if we set $g = q_i$ and $h = q_j$, or $g = p_i$ and $h = p_j$,
we obtain the results
$$[q_i, q_j] = 0, \qquad [p_i, p_j] = 0, \tag{13.14}$$
because of the independence of the $q_i$s and $p_i$s. Furthermore, if $j \neq k$,
$$[p_j, q_k] = 0. \tag{13.15}$$
If, however, $g = p_k$ and $h = q_k$, then
$$[p_k, q_k] = 1, \qquad [q_k, p_k] = -1. \tag{13.16}$$


Pairs of quantities with zero Poisson brackets are said to commute, whereas those with 
Poisson brackets equal to unity are non-commuting. Pairs of quantities with Poisson brackets 
equal to unity are said to be canonically conjugate. From (13.13), it follows that any quantity 
which commutes with the Hamiltonian H does not change with time. In particular, H is a 
constant of the motion since it commutes with itself. 
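
The bracket relations above lend themselves to a simple numerical check. The sketch below is illustrative only: it treats a single degree of freedom, evaluates the partial derivatives by central differences, and adopts the sign convention in which $[p_k, q_k] = 1$, as in (13.16). The harmonic oscillator Hamiltonian, the values of m and k and all function names are assumptions made purely for the purposes of the example.

```python
def partial(f, q, p, wrt, eps=1e-6):
    """Central-difference partial derivative of f(q, p) with respect to 'q' or 'p'."""
    if wrt == "q":
        return (f(q + eps, p) - f(q - eps, p)) / (2 * eps)
    return (f(q, p + eps) - f(q, p - eps)) / (2 * eps)


def poisson_bracket(g, h, q, p):
    """[g, h] = dg/dp * dh/dq - dg/dq * dh/dp, the ordering for which [p, q] = +1."""
    return (partial(g, q, p, "p") * partial(h, q, p, "q")
            - partial(g, q, p, "q") * partial(h, q, p, "p"))


def coord(q, p):
    """The coordinate q regarded as a function on phase space."""
    return q


def momentum(q, p):
    """The momentum p regarded as a function on phase space."""
    return p


def hamiltonian(q, p, m=2.0, k=3.0):
    """Harmonic oscillator H = p^2/2m + k q^2/2, chosen only as a test case."""
    return p * p / (2 * m) + 0.5 * k * q * q


q0, p0 = 0.7, -1.3  # an arbitrary point in phase space
print(poisson_bracket(momentum, coord, q0, p0))           # close to +1
print(poisson_bracket(coord, momentum, q0, p0))           # close to -1
print(poisson_bracket(hamiltonian, coord, q0, p0))        # q-dot = p/m
print(poisson_bracket(hamiltonian, momentum, q0, p0))     # p-dot = -k q
print(poisson_bracket(hamiltonian, hamiltonian, q0, p0))  # close to 0
```

The last three lines illustrate the statements in the text: $[H, q]$ and $[H, p]$ reproduce Hamilton's equations for the assumed oscillator, and H commutes with itself.
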

But the Poisson brackets had other key properties. Most important was the fact that all
sets of dynamical variables $Q_i$ and $P_i$ which are obtained from $q_i$ and $p_i$ by a canonical
transformation which leaves the Hamilton equations of motion unaltered, also leave their
Poisson brackets unchanged. We can summarise (13.14)-(13.16) by the expressions
$$[p_k, q_i] = \delta_{ki}, \qquad [q_k, q_i] = [p_k, p_i] = 0, \tag{13.17}$$
where $\delta_{ki}$ is the Kronecker symbol meaning $\delta_{ki} = 1$ if $k = i$ and zero otherwise. Then, if
$Q_k$ and $P_k$ are obtained from $q_k$ and $p_k$ by a canonical transformation, it follows that
$$[P_k, Q_i] = \delta_{ki}, \qquad [Q_k, Q_i] = [P_k, P_i] = 0. \tag{13.18}$$


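Furthermore, the Poisson brackets of any two functions $F_1$ and $F_2$ remain invariant under
all such transformations,
$$\sum_{i=1}^{n}\left(\frac{\partial F_1}{\partial p_i}\frac{\partial F_2}{\partial q_i} - \frac{\partial F_1}{\partial q_i}\frac{\partial F_2}{\partial p_i}\right) = \sum_{i=1}^{n}\left(\frac{\partial F_1}{\partial P_i}\frac{\partial F_2}{\partial Q_i} - \frac{\partial F_1}{\partial Q_i}\frac{\partial F_2}{\partial P_i}\right). \tag{13.19}$$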








These and many other properties of Poisson brackets were included in Whittaker’s Ana- 
lytic Dynamics and Dirac immediately set about rewriting Heisenberg’s insight in terms 
of Hamiltonian dynamics. The great advance was that Dirac could now write down the 
quantisation conditions in terms of classical Hamiltonian dynamics, using the equivalence 
of the difference of Heisenberg products to Poisson brackets. From (13.9), this equivalence 
can be written 





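$$xy - yx = \frac{ih}{2\pi}\left(\frac{\partial x}{\partial w}\frac{\partial y}{\partial J} - \frac{\partial y}{\partial w}\frac{\partial x}{\partial J}\right) = -\frac{ih}{2\pi}[x, y], \tag{13.20}$$
where the differentiations are taken with respect to the canonical coordinates J and w
and $[x, y]$ is the Poisson bracket in the sign convention of (13.10).
Thus, for the special cases of canonical coordinates p and q, it follows that
$$q_r q_s - q_s q_r = 0, \qquad p_r p_s - p_s p_r = 0, \qquad q_r p_s - p_s q_r = \delta_{rs}\,\frac{ih}{2\pi}. \tag{13.21}$$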


There is another key feature of Dirac’s approach to quantum mechanics. Notice that the 
procedure embeds quantum mechanics into the very heart of Hamiltonian dynamics. Just 
as Heisenberg appreciated that he had applied quantum concepts to space itself, so Dirac 
had done the same thing by treating the momentum and space coordinates on the same 
footing and introducing Bohr’s quantisation condition into the foundations of Hamiltonian 
mechanics. As expressed by Jammer (1989) 


‘... Dirac absorbed, so to speak, the correspondence principle as an integral part of the 
very foundation of his theory and, like Heisenberg, disposed thereby of the necessity of 
resorting to Bohr’s principle each time a problem had to be solved.’ 


13.2.3 The fundamental equations of quantum mechanics (1925) 


Dirac quickly wrote his paper with the above title and showed it to Fowler who fully 
appreciated its importance. Fowler communicated the paper to the Royal Society for rapid 
publication in the Proceedings of the Royal Society and it was received on 7 November 
1925 — it was published on 1 December 1925. Dirac’s perspective was clearly laid out in 
the introduction: 


‘In a recent paper, Heisenberg [(1925)] puts forward a new theory which suggests that it 
is not the equations of classical mechanics that are in any way at fault, but that the mathe- 
matical operations by which physical results are deduced from them require modification. 
All the information supplied by the classical theory can thus be made use of in the new 
theory.’ 


After summarising Heisenberg’s innovations, Dirac rewrites the Heisenberg product as 
follows, 


$$xy(nm) = \sum_k x(nk)\,y(km). \tag{13.22}$$
Born had made the same simplification and realised that (13.22) represented matrix multi- 
plication. Dirac took the more general view that (13.22) should be the multiplication rule 
for the quantum variables x and y without specifying at this stage exactly what they were. 
They had however to obey the addition rule 


$$\{x + y\}(nm) = x(nm) + y(nm), \tag{13.23}$$


as in the case of matrix addition. But now, Dirac takes a different route. He considers 
the rules for differentiation of the quantum variables x and y which are functions of the 
parameter v. As discussed by Mehra and Rechenberg, he now applied his knowledge of 
projective geometry to the problem of defining the differentiation dx/dv of the quantum 
variable x. His key insight was that there should be a linear relation between $dx/dv$ and
x. This is consistent with the fact that both $dx/dv$ and x can be represented by Fourier
series of the form (13.1). Therefore, he assumed that the (nm) component of $dx/dv$ could
be written as
$$\frac{dx}{dv}(nm) = \sum_{n'm'} a(nm; n'm')\,x(n'm'). \tag{13.24}$$
He then argued that quantum differentiation should satisfy the following conditions:
$$\frac{d}{dv}(x + y) = \frac{dx}{dv} + \frac{dy}{dv}, \tag{13.25}$$
$$\frac{d}{dv}(xy) = \frac{dx}{dv}\,y + x\,\frac{dy}{dv}, \tag{13.26}$$


where the order of multiplication of the variables in (13.26) has to be preserved. Dirac now 
carried out an analysis of the constraints on the process of quantum differentiation with 
respect to v assuming that the rules (13.24) and (13.25) had to be maintained. From this 
analysis, he concluded that the most general form of quantum differentiation was 


$$\frac{dx}{dv} = xa - ax, \tag{13.27}$$


where the new quantum variable a has components a(nm). Thus, the differential of the 
quantum variable x with respect to any parameter v is expressed as the difference of 
the Heisenberg products of the quantum variables x and a. This key result showed that the 
differential equations of classical mechanics were to be replaced by algebraic equations 
involving the addition and multiplication of quantum variables. Dirac was now in a position 
to apply the full apparatus of classical Hamiltonian dynamics to quantum mechanics. 
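
The algebraic content of this argument can be illustrated with small arrays. In the sketch below, which is ours and not Dirac's, the quantum variables are represented by 3 × 3 integer arrays, the Heisenberg product is the multiplication rule (13.22), and the derivative is defined by (13.27). The particular arrays X, Y and A are arbitrary choices made only for the example. The check confirms that a derivative of the form xa − ax automatically satisfies the sum rule and the ordered product rule demanded of quantum differentiation.

```python
def matmul(x, y):
    """Heisenberg product (13.22): xy(nm) = sum over k of x(nk) y(km)."""
    size = len(x)
    return [[sum(x[n][k] * y[k][m] for k in range(size)) for m in range(size)]
            for n in range(size)]


def add(x, y):
    """Addition rule (13.23), component by component."""
    return [[a + b for a, b in zip(row_x, row_y)] for row_x, row_y in zip(x, y)]


def sub(x, y):
    """Component-by-component difference."""
    return [[a - b for a, b in zip(row_x, row_y)] for row_x, row_y in zip(x, y)]


def derivative(x, a):
    """Quantum differentiation (13.27): dx/dv = xa - ax."""
    return sub(matmul(x, a), matmul(a, x))


# Arbitrary illustrative arrays (assumptions, not taken from the text).
X = [[0, 1, 0], [1, 0, 2], [0, 2, 0]]
Y = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
A = [[0, 1, 1], [0, 0, 1], [1, 0, 0]]

# The Heisenberg product depends on the order of the factors.
non_commuting = matmul(X, Y) != matmul(Y, X)

# d(x + y)/dv = dx/dv + dy/dv
sum_rule = derivative(add(X, Y), A) == add(derivative(X, A), derivative(Y, A))

# d(xy)/dv = (dx/dv) y + x (dy/dv), with the order of the factors preserved
product_rule = derivative(matmul(X, Y), A) == add(
    matmul(derivative(X, A), Y), matmul(X, derivative(Y, A)))

print(non_commuting, sum_rule, product_rule)
```

Integer entries are used so that the comparisons are exact. The product rule holds identically because the cross terms xay cancel, which is why the ordering of the factors must be respected.
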

Particularly notable was his derivation of Bohr’s frequency condition according to the
new formalism. Just as in the case of matrix mechanics, ‘diagonal’ terms of the form $C(nn)$
correspond to constants of the motion. Therefore, the Hamiltonian terms $H(nn)$ correspond
to the energies of the stationary states of the system. From (13.13), it follows that
$$\dot x = [H, x]. \tag{13.28}$$
Combining this with the fundamental relation (13.20),
$$xH - Hx = -\frac{ih}{2\pi}[x, H], \tag{13.29}$$
it follows that
$$x(nm)H(mm) - H(nn)x(nm) = \frac{ih}{2\pi}\,\dot x(nm) = -\frac{h}{2\pi}\,\omega(nm)\,x(nm), \tag{13.30}$$
since the time dependence of $x(nm)$ is entirely determined by the term $\exp[i\omega(nm)t]$.
Hence,
$$h\nu(nm) = \frac{h}{2\pi}\,\omega(nm) = H(nn) - H(mm) = E_n - E_m, \tag{13.31}$$
where $E_n$ and $E_m$ are the energies of the stationary states n and m.
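
The frequency condition can be illustrated with a concrete case. The sketch below is an assumption-laden example rather than part of Dirac's argument: it takes the standard harmonic oscillator amplitudes for x(nm) and p(nm), which are not derived in this section, works in units in which h/2π = 1 with unit mass and unit frequency, and truncates the infinite arrays to 6 × 6. The Hamiltonian is formed by Heisenberg products, and the frequency ω(nm) is then extracted from the difference xH − Hx, as in (13.30).

```python
import math

HBAR = 1.0     # assumption: units in which h / 2 pi = 1
MASS = 1.0     # assumed mass
OMEGA0 = 1.0   # assumed classical angular frequency
SIZE = 6       # truncation of the infinite arrays x(nm) and p(nm)


def matmul(x, y):
    """Heisenberg product: xy(nm) = sum over k of x(nk) y(km)."""
    return [[sum(x[n][k] * y[k][m] for k in range(SIZE)) for m in range(SIZE)]
            for n in range(SIZE)]


def oscillator_x_p():
    """Standard oscillator amplitudes, non-zero only between neighbouring states."""
    x = [[0j] * SIZE for _ in range(SIZE)]
    p = [[0j] * SIZE for _ in range(SIZE)]
    for n in range(1, SIZE):
        amp = math.sqrt(n * HBAR / (2 * MASS * OMEGA0))
        x[n][n - 1] = x[n - 1][n] = amp
        p[n][n - 1] = 1j * MASS * OMEGA0 * amp
        p[n - 1][n] = -1j * MASS * OMEGA0 * amp
    return x, p


x, p = oscillator_x_p()
x2, p2 = matmul(x, x), matmul(p, p)

# H = p^2 / 2m + m omega0^2 x^2 / 2, built with Heisenberg products.
H = [[p2[n][m] / (2 * MASS) + 0.5 * MASS * OMEGA0 ** 2 * x2[n][m]
      for m in range(SIZE)] for n in range(SIZE)]

xH, Hx = matmul(x, H), matmul(H, x)

# The last diagonal entry is spoiled by the truncation and is left out.
energies = [H[n][n].real for n in range(SIZE - 1)]

# From (13.30): (xH - Hx)(nm) = -(h / 2 pi) omega(nm) x(nm), here with m = n - 1.
frequencies = []
for n in range(1, SIZE - 1):
    m = n - 1
    difference = xH[n][m] - Hx[n][m]
    frequencies.append((-difference / (HBAR * x[n][m])).real)

print(energies)
print(frequencies)
```

Within the truncation, the diagonal terms H(nn) come out as (n + 1/2) in these units and every ω(n, n − 1) equals the assumed classical frequency, so that (h/2π)ω(nm) = H(nn) − H(mm) as in (13.31).
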
Fowler fully appreciated the deep significance of Dirac’s paper and this was corroborated
by the response of Heisenberg, to whom Dirac had sent a manuscript copy of his paper. On 
20 November 1925, Heisenberg wrote to Dirac as follows: 


‘I have read your extraordinarily beautiful paper on quantum mechanics with the greatest
interest, and there can be no doubt that all your results are correct as far as one believes
at all in the newly proposed theory. ... [Dirac’s paper was] also really better written and
more concentrated than our attempts here [in Göttingen].’


Heisenberg particularly liked the derivation of Bohr’s frequency condition, which used
quite general principles rather than a specific model of atomic systems. Dirac was well
aware that Born, Heisenberg and Jordan had developed their matrix mechanical approach 
to quantum mechanics and so he immediately turned all his energies to perfecting his own 
rather different approach. 


13.3 Quantum algebra, q- and c-numbers and the hydrogen atom


Dirac had stressed the formal algebraic nature of the quantum variables in his paper of 
1925, but he now formalised the concept by introducing a distinction between the quantum 
variables, which he called q-numbers, and the ordinary numbers of arithmetic, which he
called c-numbers. As he wrote, ‘q stands for quantum, or maybe queer, and c for classical, 
or maybe commuting’ (Dirac, 1977). He makes this distinction clear in the introduction to 
his paper (Dirac, 1926d). 


‘The fact that the variables used for describing a dynamical system do not satisfy the 
commutative law means, of course, that they are not numbers in the sense of the word 
previously used in mathematics. To distinguish the two kinds of numbers, we shall call 
the quantum variables q-numbers and the numbers of classical mathematics which satisfy
the commutative law c-numbers...

At present one can form no picture of what a q-number is like. One cannot say that
one q-number is greater or lesser than another... One knows nothing of the processes by
which the numbers are formed except that they satisfy all the ordinary laws of algebra, 
excluding the commutative law of multiplication...’ 


In his reminiscences, he wrote 


‘Now, I did not know anything about the real nature of q-numbers. Heisenberg’s matrices
I thought were just an example of q-numbers; maybe q-numbers were something more
general. ... I proceeded to develop a theory in which I felt free to make any assumption
I wanted to, unless they led immediately to an inconsistency. I did not bother at all about
finding a precise mathematical nature for q-numbers, or any kind of precision in dealing
with them.’ (Dirac, 1977) 


In this paper and three succeeding papers of 1926 (Dirac, 1926b,c,e), Dirac put the em- 
phasis upon the new algebra necessary to accommodate Heisenberg’s rules for quantum 
multiplication, the last paper published in the Proceedings of the Cambridge Philosophical 
Society having the title On quantum algebra (Dirac, 1926b). We recall the remark at the
beginning of his paper of 1925 that 


‘[it is] not the equations of classical mechanics that are in any way at fault, but that 
the mathematical operations by which physical results are deduced from them require 
modification.’ 


In other words, it is not the forms of the laws of motion which are at fault — the wrong 
algebra is being used to carry out the mathematical operations. 

Dirac had exactly the right training for making this imaginative leap in the dark. From the 
beginning he had a strong mathematical bent with a particular interest in geometry. After 
his engineering degree at Bristol University, he studied for a second degree in mathematics, 
which he completed in two rather than three years. A particularly important influence was
Peter Fraser, an outstanding mathematics lecturer, from whom Dirac learned the importance 
of rigorous mathematics, as opposed to the non-rigorous tools commonly used by engi- 
neers and physicists. He also learned projective geometry which was to have an important 
influence on his thinking. 

At Cambridge, his supervisor was Ralph Fowler, the leading quantum theorist in Cam- 
bridge who stimulated his interest in atomic problems. At the same time, he participated 
regularly in the Saturday afternoon geometrical tea parties held by Henry Baker, the Lown- 
dean Professor of Astronomy and Geometry. All Baker’s students were required to attend 
these parties and there were lectures and discussions of geometrical topics after tea — Dirac 
presented his first lecture at one of these. 

Baker’s masterpiece was his six-volume Principles of Geometry, published from 1922 
to 1925, volume one of which was to be of special significance for Dirac (Baker, 1922). In 
his memoirs, Dirac wrote: 


‘... I was always very much interested in geometry . . . You can divide all mathematicians
into two classes, those whose main interest is geometry, those whose main interest is 
algebra.... Now, a good mathematician has to be a master of both geometry and of 
algebra, and he has to be able to pass from one to the other quite freely according to the 
nature of the problem that he is working on... . my preference was strongly on the side of 
geometry, and has always remained so.’ 


Later, he remarks 


‘[Projective geometry] was a most useful tool for research, but I did not mention it in my 
published work. . . . I felt that most physicists were not familiar with it. When I obtained a 
particular result, I translated it into an analytic form and put down the argument in terms 
of equations. That was an argument which any physicist would be able to understand 
without having had this special training.’ 


Although projective geometry had applications in relativity theory, no-one had considered 
that it would have importance for quantum theory. Dirac, however, appreciated a key result 
contained in Baker’s book, namely, that there existed a complete set of laws for ‘non- 
commuting’ numbers, similar to those of ordinary commuting numbers. 

Section III of Chap. 1 of Baker’s first volume describes the elements of symbolic algebra 
needed to represent geometrical operations (Baker, 1922). He uses the term symbol rather 
than variable since the algebraic rules should apply for a wide range of mathematical
entities, for example, vectors, matrices and so on. He gives examples of various types of
symbols on pp. 68-69. These examples include 2 × 2 matrices with complex components
which were eventually to become Dirac spin matrices. Specifically, he demonstrates with
these examples that ‘the equation ab = ba is not generally true’. Baker stated the basic
axiom as follows: 


‘The symbols which we first introduce are, speaking in general terms, subject to all 
the laws of ordinary algebra, except the commutative law of multiplication. They are 
not necessarily, and will not be finally, arranged in order of magnitude, so that, in their 
entirety, they are wider than the real numbers of arithmetic.’ 


It is remarkable that Dirac takes over, almost word-for-word, the foundations of Baker’s 
symbolic algebra as the basis for his rules of quantum algebra. Pages 62-69 of Baker’s 
book indicate clearly the source of Dirac’s insights and of his translation of the symbols of 
projective geometry into the variables of quantum algebra. 

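Thus, if $z_1$, $z_2$ and $z_3$ are q-numbers, they obey the rules of ordinary algebra as follows:

$$\begin{aligned}
z_1 + z_2 &= z_2 + z_1, \\
(z_1 + z_2) + z_3 &= z_1 + (z_2 + z_3), \\
(z_1 z_2) z_3 &= z_1 (z_2 z_3), \\
z_1 (z_2 + z_3) &= z_1 z_2 + z_1 z_3, \qquad (z_1 + z_2) z_3 = z_1 z_3 + z_2 z_3.
\end{aligned}$$

If
$$z_1 z_2 = 0, \quad \text{either } z_1 = 0 \text{ or } z_2 = 0;$$
but
$$z_1 z_2 \neq z_2 z_1,$$
in general, except when $z_1$ or $z_2$ is a c-number.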

In Sect. 2, the rules of Poisson bracket manipulations are adapted for q-numbers. In
particular, the full apparatus of Hamiltonian mechanics can be taken over into his quantum
algebra of q-numbers. Thus, if $Q_r$ and $P_r$ are canonical variables, which are functions of
q-numbers, and these obey the rules for Poisson brackets, written here in Dirac’s
ordering, which has the opposite sign to (13.16),
$$[Q_r, P_s] = \delta_{rs}, \qquad [Q_r, Q_s] = [P_r, P_s] = 0, \tag{13.32}$$
then, just as in Hamiltonian mechanics, $Q_r$ and $P_r$ can be transformed to another set of
canonical variables $q_r$ and $p_r$ by relations of the form
$$Q_r = b\,q_r\,b^{-1}, \qquad P_r = b\,p_r\,b^{-1}, \tag{13.33}$$


where b is a q-number. Notice the strong similarity to the canonical transformations (12.64)
introduced by Born and his colleagues in their matrix treatment of quantum algebra. 
Interestingly, Dirac remarks that ‘these formulae do not appear to be of great practical 
value’, whereas they were central to Born’s procedures for the transformation to principal 
axes, that is, the diagonalisation of the matrices from which the energies of the stationary 
states could be determined. By the same arguments, the equations of motion could be 
written 

\[ \dot{x} = [x, H], \tag{13.34} \]

where the Hamiltonian $H$ is now a q-number. Dirac does not define what q-numbers are 
until he is forced to confront the theory with experiment. He does however concede that, 
in the case of multiply periodic motion, they can be represented by a set of harmonic 
components of the form $x(nm)\exp[i\omega(nm)t]$, where $x(nm)$ and $\omega(nm)$ are c-numbers. Then 
$\dot{x}$ has components $i\omega(nm)\,x(nm)\exp[i\omega(nm)t]$. 

Before proceeding to the application of this algebra to quantum mechanics, Dirac 
develops a number of algebraic theorems which illustrate the care which has to be taken in 
manipulating q-numbers, in particular, the order in which the operations are carried out. 
Thus, for example, he establishes the results that 

\[ \frac{1}{xy} = \frac{1}{y}\,\frac{1}{x}, \qquad \frac{d}{dt}\left(\frac{1}{x}\right) = -\frac{1}{x}\,\dot{x}\,\frac{1}{x}. \tag{13.35} \]

The binomial expansion of $(1 + x)^n$ is the same as in ordinary algebra where $n$ is a c-number 
and the exponential series can be defined as the same power series as in ordinary algebra. 
Generally, however, $e^{x+y} \neq e^{x} e^{y}$, unless $x$ and $y$ commute. 
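
These ordering rules can be illustrated numerically. The sketch below again assumes that 2x2 matrices may stand in for q-numbers; the helper names `mul`, `inverse` and `exp_series` are hypothetical and the particular matrices are arbitrary choices. It checks that the inverse of a product reverses the order of the factors, and that the exponential of a sum differs from the product of exponentials when the two quantities do not commute.

```python
from fractions import Fraction
import math


def mul(a, b):
    """Row-by-column product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(a):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def exp_series(a, terms=40):
    """Exponential defined by its power series, truncated after `terms` terms."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[v / n for v in row] for row in mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result


# Exact rational arithmetic for the inverse rule 1/(xy) = (1/y)(1/x).
x = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]
y = [[Fraction(0), Fraction(1)], [Fraction(1), Fraction(1)]]
reversed_order_ok = inverse(mul(x, y)) == mul(inverse(y), inverse(x))
same_order_ok = inverse(mul(x, y)) == mul(inverse(x), inverse(y))

# Two non-commuting matrices for the exponential comparison.
p = [[0.0, 1.0], [0.0, 0.0]]
q = [[0.0, 0.0], [1.0, 0.0]]
p_plus_q = [[0.0, 1.0], [1.0, 0.0]]
exp_of_sum = exp_series(p_plus_q)
product_of_exps = mul(exp_series(p), exp_series(q))

print(reversed_order_ok, same_order_ok)
print(exp_of_sum[0][0], math.cosh(1.0), product_of_exps[0][0])
```

For this choice the top-left element of $e^{p+q}$ is $\cosh 1$, whereas that of $e^{p}e^{q}$ is 2, so the two expressions disagree.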

Dirac now needed to show that the new formalism could be used to carry out useful 
computations which could be compared with known results in quantum physics. This was 
not a trivial task since there were relatively few problems which could be tackled with 
the state of development of quantum algebra at the time. In his paper entitled Quantum 
mechanics and a preliminary investigation of the hydrogen atom (Dirac, 1926d), Dirac 
tackled the determination of the energy levels of the hydrogen atom. To do this, he had to 
convert the classical theory of multiply periodic systems into the language of q-numbers 
and quantum algebra. He started with the advantage that he was already an expert on 
the mathematical procedures of action-angle variables and their application to multiply 
periodic systems. Section 4 of the paper carries out this translation. On the basis of his 
new quantum algebra, he set out the postulates of the quantum theory of multiply periodic 
dynamical systems where the action-angle variables $J_r$ and $w_r$ are now q-numbers and so 
have to obey the commutation relations. 

Dirac showed that, using his new formalism, he could work out the frequency of the orbits 
of electrons in the hydrogen atom and Bohr’s quantum relation, noting that the quantities 
involved were q-numbers rather than the c-numbers of the old quantum theory. In addition, 
he derived Heisenberg’s multiplication rule for the quantities $x$ and $y$, 

\[ xy(n, n-\gamma) = \sum_{\alpha} x(n, n-\alpha)\,y(n-\alpha, n-\gamma). \tag{13.36} \]
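
Heisenberg's multiplication rule (13.36) has the same structure as the row-by-column product of matrices. A minimal sketch of that correspondence follows, assuming the arrays are truncated to a finite 3x3 size (Heisenberg's arrays are infinite) and using hypothetical function names and arbitrary integer entries.

```python
def heisenberg_element(x, y, n, gamma):
    """Sum over alpha of x(n, n-alpha) * y(n-alpha, n-gamma) for finite arrays."""
    size = len(x)
    total = 0
    for alpha in range(n - size + 1, n + 1):   # n - alpha runs over 0 .. size-1
        total += x[n][n - alpha] * y[n - alpha][n - gamma]
    return total


def matrix_product(x, y):
    """Ordinary row-by-column matrix product."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


x = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
y = [[2, 0, 1], [1, 1, 0], [0, 5, 2]]
product = matrix_product(x, y)

# The element (n, m) of the product corresponds to gamma = n - m in the rule.
rule_matches = all(
    heisenberg_element(x, y, n, n - m) == product[n][m]
    for n in range(3) for m in range(3)
)
print(rule_matches)
```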





Sections 5, 6 and 7 are concerned with solving the orbital equations for the electron in 
the hydrogen atom, but now using all the rules of quantum algebra. This analysis required 
considerable care but resulted in a Hamiltonian of the form 

\[ H = \frac{1}{2m_e}\left(p_r^2 + \frac{k_1 k_2}{r^2}\right) - \frac{e^2}{4\pi\epsilon_0 r}, \tag{13.37} \]

where $k_1$ and $k_2$ are q-numbers given by $k_1 = k + \tfrac{1}{2}h$ and $k_2 = k - \tfrac{1}{2}h$, $k$ being the 
q-number which describes the angular momentum, which is the conjugate momentum 
associated with the angle variable $\theta$. 

After a detailed analysis, Dirac obtained the energies of the stationary states of the 
hydrogen atom and its transition frequencies in the form 

\[ \nu = \frac{H(P + nh) - H(P)}{h} = \frac{m_e e^4}{8\epsilon_0^2 h}\left[\frac{1}{P^2} - \frac{1}{(P + nh)^2}\right], \tag{13.38} \]

where $P = mh$ and $m$ takes integral values. This is precisely the formula for the lines in 
the frequency spectrum of the hydrogen atom. 

Dirac’s achievement was no less a tour de force than Pauli’s. They had both completed 
calculations of very considerable complexity and, by quite different approaches (Pauli by 
applying the concepts of matrix mechanics and Dirac by invoking his rules of quantum 
algebra), arrived at the same result, which agreed with experiment. In his correspondence 
with Heisenberg, Dirac had been informed that Pauli had already solved the problem of the 
hydrogen atom using matrix mechanics, but this did not worry Dirac particularly. From his 
point of view, this was only a ‘reality check’ that quantum algebra was not just a theoretical 
construct, but a version of quantum mechanics which could account for the results of experiment. 


13.4 Multi-electron atoms, On quantum algebra and a PhD dissertation 


Dirac’s objective all along had been to use the formalism of quantum algebra to tackle the 
problems of multi-electron atoms and this he proceeded to carry out, the end result being 
the paper with the unpromising title The elimination of the nodes in quantum mechanics. 
Once again, the inspiration had come from celestial mechanics and the problem of the orbits 
of point masses acting under gravity. As expressed by Dirac, 


‘In the classical treatment of the dynamical problem of a number of particles or electrons 
moving in a central field of force and disturbing one another, one always begins by 
making the initial simplification, known as the elimination of the nodes, which consists 
in obtaining a contact transformation from the Cartesian co-ordinates and momenta of 
the electrons to a set of canonical variables, of which all except three are independent 
of the orientation of the system as a whole, while these three determine the orientation. 
In the absence of an external field of force, the Hamiltonian, when expressed in terms of 
the new variables, must be independent of these three, which simplifies the equations of 
motion.’ (Dirac, 1926e) 


The objective of the paper was to determine the necessary contact transformation to 
enable the programme to be carried out. First of all, Dirac derived the properties of the 
q-numbers representing linear and angular momentum in his quantum algebra and obtained 
independently the rules for the quantisation of angular momentum which had already 
been established by Born, Heisenberg and Jordan (1926) (see Sect. 12.3.3). Next, the 
appropriate forms of the action-angle variables were determined. Once this was achieved, 
the transformation equations were determined, first for a single electron, then for a system 
of two electrons and finally for systems with more than two electrons. In the last case, the 
multi-electron atom was modelled by the ‘core plus valence electron picture’ so that the 
problem was reduced to a ‘two-electron’ problem (see Sect. 7.4). 

The paper is once again a technical tour de force. Of particular interest is Dirac’s 
treatment of the anomalous Zeeman effect in Sect. 9. He used the standard atomic model of an 
electron plus core and, following the experimental evidence for the anomalous value of the 
gyromagnetic ratio (Sect. 7.5.2), gave the core twice the ratio of magnetic moment to 
angular momentum as compared with the standard Lorentz value (7.39). With this assumption, 
he derived an expression for the Landé g-factor 


mE (: „la -Ak the) . 
2 JR 

which agreed precisely with Landé’s empirical expression discussed in Sect. 7.4 and given 
by (7.31). Finally, he showed that the theory also gave Kronig’s results for the relative 
intensities of the multiplets and their components in a weak magnetic field. 

The fourth paper was his short contribution On quantum algebra to the Mathematical 
Proceedings of the Cambridge Philosophical Society (Dirac, 1926b). This was a formal 
description of the algebra of q-numbers stated axiomatically and concisely. It summarises 
in one paper the necessary formalities for the application of quantum algebra to physics at 
the quantum level. 

Dirac submitted his PhD dissertation with the title Quantum mechanics in May 1926, 
including in it the four papers discussed in this chapter (Dirac, 1926a). This was a remarkable 
achievement in that it provided a complete and self-consistent theory of quantum mechanics, 
all discovered in the previous nine months. The examiners included Arthur Eddington and 
they were very impressed by the dissertation. In June 1926, Eddington took the unusual step 
of writing personally to Dirac to congratulate him warmly on his remarkable achievement 
(Farmelo, 2009). 

This brought to a close Dirac’s first burst of creativity in developing the fundamentals 
of quantum theory. Soon, he was to add the quite different approach pioneered by Erwin 
Schrödinger to his armoury of weapons for attacking the problems of quantum theory. 





(13.39) 
On 13 March 1926, the first of six papers on wave mechanics by Erwin Schrödinger was 
published in the Annalen der Physik with the title Quantisation as an eigenvalue problem 
(Part 1) (Schrödinger, 1926b). The startling first paragraph reads: 


‘In this paper, I wish to consider, first, the simple case of the hydrogen atom (non- 
relativistic and unperturbed), and show that the customary quantum conditions can be 
replaced by another postulate, in which the concept of “whole numbers”, merely as such, 
is not introduced. Rather, when integralness does appear, it arises in the same natural way 
as it does in the case of node-numbers of a vibrating string. The new conception is capable 
of generalisation, and strikes, I believe, very deeply at the true nature of quantum rules.’ 


These papers were the fruits of an extraordinary burst of creativity on Schrödinger’s part 
which resulted from his interactions with Einstein in the latter part of 1925. Central to 
these exchanges were de Broglie’s remarkable researches which culminated in his famous 
PhD dissertation and published papers of 1924. These events have already been recounted in 
Chap. 9. The subsequent developments which led to Schrödinger’s discovery of the equation 
which bears his name will be taken up in Sect. 14.2, but let us first understand more about 
Schrödinger’s background. 


14.1 Schrödinger’s background in physics and mathematics 




14.1.1 Education and career up to 1925 


Unlike Heisenberg, Jordan, Pauli and Dirac, Erwin Schrödinger was not one of the young 
Turks who developed Knabenphysik, young man’s physics. In 1926, they were all aged 
about 25, while Schrödinger was 38, more or less of the same generation as Born who was 
five years his senior. Schrödinger was born in 1887 in Vienna, which had become a leading 
centre for experimental and theoretical physics in the latter years of the nineteenth century. 
In 1866, Josef Stefan had become director of the Physical Institute of the University of 
Vienna where his doctoral students included Ludwig Boltzmann, Marian Smoluchowski 
and Johann Josef Loschmidt. In turn, Stefan and Boltzmann were the teachers of Friedrich 
Hasenöhrl who was to succeed Boltzmann as Professor of Theoretical Physics at the 
University of Vienna. Hasenöhrl was the only Austrian to participate in the 1911 Solvay 
Conference on quanta. Following his tenure of the Chair of Physics at Prague, Ernst Mach 
returned to the University of Vienna in 1895 as Professor of Philosophy. The stage was set 
for the vehement debates between Boltzmann and Mach on the reality of the existence of 
atoms. 

Schrödinger began his university studies in 1906 as the community of Viennese scientists 
was coming to terms with the tragedy of Boltzmann’s suicide in September of that year. 
Schrödinger studied physics at the University of Vienna between 1906 and 1910 under Franz 
Exner in experimental physics and Hasenöhrl in theoretical physics. He was recognised as 
an outstanding student and, following a year’s military service, became an assistant at 
Exner’s Physical Institute in 1911. Having obtained his Habilitation in January 1914, he 
was appointed a Privatdozent at the University of Vienna where he carried out a wide range 
of studies in both experimental and theoretical physics. 

During the First World War, Schrödinger participated in the war effort as a commissioned 
officer in the Austrian fortress artillery. Tragically, Hasenöhrl was killed during the 
hostilities in October 1915. Later, on receiving the Nobel Prize for Physics in 1933, Schrödinger 
wrote in his autobiographical notes, 


‘At that time, Hasenöhrl was killed in battle, and I feel that otherwise today his name 
would now stand in place of mine.’ (Schrödinger, 1935) 


In 1917, still on military duty, he returned to Vienna as an instructor in meteorology for 
anti-aircraft officer-candidates. He was able to continue his researches during the War years, 
producing 10 scientific papers on a variety of themes. In 1919, he completed what was to 
prove to be his last paper on experimental physics — from this time onwards, his scientific 
papers were entirely of a theoretical nature. 

With the collapse and breakup of the Austro-Hungarian empire at the end of the First 
World War and the poor prospects of obtaining a position in Austria, Schrödinger decided 
that his future lay in Germany and in the spring of 1920 he became the assistant to Max Wien, 
the brother of Wilhelm (Willy) Wien, at the University of Jena. This was the beginning 
of a period of continuous transfer between one university and another as he moved up 
the academic ladder. In September 1920, he was appointed an extraordinary Professor of 
Theoretical Physics at the Technische Hochschule in Stuttgart and then in the spring of 
1921 to an ordinary professorship at the University of Breslau. Finally, in October 1921 he 
was called to the Professorship of Theoretical Physics at the University of Zürich where 
he was to remain for the next six years — during this period he carried out his pioneering 
researches in wave mechanics. 


14.1.2 Scientific accomplishments up to 1925 


Schrödinger had received a very thorough training in experimental and theoretical physics at 
the University of Vienna. The most influential of his teachers was Hasenöhrl whose excellent 
lectures included the foundations of analytic mechanics, the dynamics of deformable bodies, 
special attention being paid to the solution of partial differential equations and eigenvalue 
problems, as well as Maxwell’s equations, electromagnetic theory, optics, thermodynamics 
and statistical mechanics. It is striking that Schrödinger’s interests in physics spanned a 
very wide range of topics, whilst the problems of quantum physics occupied a relatively 
minor role in his early work. He published papers on the kinetic theory of magnetism, on 
dielectrics and anomalous dispersion, atmospheric electricity and the origin of penetrating 
radiation. The last topic preceded the discovery in 1912 of the cosmic radiation by Victor 
Hess, who was first assistant to Stefan Meyer, director of the Institut für Radiumforschung in 
Vienna (Hess, 1913). Schrödinger had worked on the height dependence of the penetrating 
radiation assuming it originated in the Earth — Hess showed conclusively that the radiation 
was extraterrestrial and made reference to Schrödinger’s paper in his article. 

More germane to the present story was Schrödinger’s research on the atomic structure of 
solids. Following the pioneering researches on the specific heat capacities at low 
temperatures by Einstein and Debye, the lattice theory of the structure of solids had been the subject 
of important papers by Born and von Karman (Born and von Karman, 1912). This work 
was elaborated theoretically by Schrödinger in his first important paper under the title On 
the dynamics of elastically coupled point systems (Schrödinger, 1914). The theory required 
the use of advanced methods in the eigenvalue theory for infinite systems, but this posed 
little difficulty for him because of his training in theoretical physics under Hasenöhrl. In 
his courses, Hasenöhrl had treated 


‘the higher theoretical schemes of mechanics as well as the problem of eigenvalues in 
continuum physics.’ (Schrödinger, 1935) 


During the War years, Schrödinger studied Smoluchowski’s theory of fluctuations and 
general relativity. Towards the end of the War, he wrote a substantial review of atomic 
and molecular specific heats which necessarily involved the introduction of quantisation to 
understand their low temperature behaviour (Schrödinger, 1919). He continued his broad 
interests in physics and other disciplines such as the theory of colours and colour perception. 
He maintained an interest in quantum problems, for example, in his paper on penetrating 
orbits which was discussed in Sect. 8.2 and which dates from his short stay at Stuttgart 
(Schrödinger, 1921), but he was not one of the main protagonists of quantum theory. It 
therefore came as a complete surprise when Schrödinger discovered wave mechanics which 
at first sight was a completely different approach from the technically complex methods 
developed in Göttingen, Copenhagen and Cambridge. Let us trace the steps which led to 
Schrödinger’s dramatic discovery of his wave equation. 


14.2 Einstein, de Broglie and Schrödinger 


The deep significance of Bose’s famous paper of 1924 was fully appreciated by Einstein. 
It described a new type of statistics for counting photons which resulted in an elegant 
derivation of the spectrum of black-body radiation. Einstein realised that the counting 
procedures for indistinguishable particles could be applied equally well to atoms as to 
photons. The procedures described in Sect. 9.2 led to the expression (9.7) for the number of 
particles in the state $k$ 

\[ n_k = \frac{g_k}{e^{\alpha + \beta\varepsilon_k} - 1}, \tag{14.1} \]
where the constants $\alpha$ and $\beta$ are to be determined and $g_k$ is the degeneracy of the energy 
level with energy $\varepsilon_k$ for photons or atoms. On thermodynamic grounds, $\beta = 1/kT$ and $\alpha$ 
is determined by fixing the number of photons or atoms. For black-body radiation, $\alpha$ is 
set equal to zero, meaning that the photon number is not conserved, but that the number 
density of photons is matched to the total energy under the black-body spectrum: the 
properties of the black-body spectrum are uniquely defined by the single parameter $T$, the 
thermodynamic temperature of the radiation. 

Einstein carried out an exactly parallel analysis for the atoms of a monatomic gas, but 
now the numbers of particles had to be conserved and so $\alpha$ had to be a non-zero constant. 
The same statistical procedures as in Sect. 9.2 (Einstein, 1924) were used. The total number 
of available states in momentum space is given by (9.9), but now the momenta of the atoms 
of the gas are given by $p^2 = 2mE$, rather than by $p = h\nu/c$. Therefore, the volume of 
phase space for a monatomic gas is 

\[ V\,dp_x\,dp_y\,dp_z = V\,4\pi p^2\,dp, \tag{14.2} \]

where $p = (2mE)^{1/2}$, $dp = \tfrac{1}{2}(2mE)^{-1/2} \times 2m\,dE$ and $V$ is the physical volume within 
which the gas is contained. Since the elementary volume of phase space is $h^3$, the 
number of available states with energy in the interval $E$ to $E + dE$ in the volume $V$ is 

\[ g_k = \frac{V\,4\pi p^2\,dp}{h^3}, \tag{14.3} \]


and so the number of particles in the energy interval E to E + dE is 


\[ N(E)\,dE = \frac{g_k}{e^{\alpha + \beta E} - 1} = \frac{4\pi V\,(2m^3 E)^{1/2}}{h^3}\,\frac{dE}{\left(e^{\alpha + \beta E} - 1\right)}. \tag{14.4} \]
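
The step from the count of states (14.3) to the energy distribution (14.4) is a change of variable from momentum to energy. As a worked sketch, using only the relations $p = (2mE)^{1/2}$ and $dp = \tfrac{1}{2}(2mE)^{-1/2} \times 2m\,dE$ quoted above:

```latex
\begin{align*}
4\pi p^2\,dp
  &= 4\pi\,(2mE)\times\tfrac{1}{2}(2mE)^{-1/2}\,2m\,dE \\
  &= 4\pi m\,(2mE)^{1/2}\,dE \\
  &= 4\pi\,(2m^3E)^{1/2}\,dE ,
\end{align*}
\qquad\text{so that}\qquad
\frac{V\,4\pi p^2\,dp}{h^3} = \frac{4\pi V\,(2m^3E)^{1/2}}{h^3}\,dE .
```

Dividing this number of states by $e^{\alpha + \beta E} - 1$, as in (14.1), gives the right-hand side of (14.4).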


In the limit of very low densities and high temperatures, (14.4) becomes the standard 
Boltzmann distribution and so $\beta = 1/kT$. We can therefore write 

\[ N(E)\,dE = \frac{g_k}{B e^{E/kT} - 1} = \frac{4\pi V\,(2m^3 E)^{1/2}}{h^3}\,\frac{dE}{\left(B e^{E/kT} - 1\right)}, \tag{14.5} \]

where $B = e^{\alpha}$. 
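
The role of the constant $B$ in (14.5) can be seen by comparing the occupation factor $1/(B e^{E/kT} - 1)$ with its classical counterpart $e^{-E/kT}/B$. The following is a small sketch with hypothetical function names and arbitrarily chosen values of $B$ and $E/kT$; it is meant only to show the two limiting behaviours.

```python
import math


def bose_einstein_occupation(x, big_b):
    """Mean occupation per state, 1/(B e^{E/kT} - 1), with x = E/kT."""
    return 1.0 / (big_b * math.exp(x) - 1.0)


def boltzmann_occupation(x, big_b):
    """Classical limit e^{-E/kT}/B of the same expression."""
    return math.exp(-x) / big_b


# Large B (dilute gas): the two occupation factors nearly coincide.
dilute = bose_einstein_occupation(1.0, 1.0e4) / boltzmann_occupation(1.0, 1.0e4)

# B close to 1: low-energy states become very heavily populated.
crowded = bose_einstein_occupation(0.001, 1.001)

print(dilute, crowded)
```

For $B \gg 1$ the ratio of the two factors is very close to unity, while for $B$ just above 1 the occupation of a low-energy state runs to several hundred.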

Einstein fully appreciated the remarkable properties of this new distribution (Einstein, 
1924, 1925). $B$ cannot be less than one or else the number of particles could become 
negative and so $B \geq 1$. When $B$ is very close to 1, the number of particles in low energy 
states can become very large. In fact, the effect is much more dramatic than this, as Einstein 
demonstrated in his second paper. A version of Einstein’s analysis is given in the endnotes. 
What had gone wrong was that the continuum approximation (14.3), in which $g_k$ was 
replaced by $N(E)\,dE$, does not take account of the fact that the zero energy quantised 
state can contain a finite number of particles. The quantisation of the states, which results 
in a ground state with zero energy, has to be included in the calculation. At low enough 
temperatures, $T < T_B$, where 

\[ T_B = \frac{h^2}{2\pi m k}\left(\frac{N}{2.612\,V}\right)^{2/3}, \tag{14.6} \]
the particles accumulated in the zero energy state and, as $T \to 0$, all the particles would 
be condensed into the same zero energy state. This is the origin of the phenomenon of the 
Bose-Einstein condensation. The gas is split into two parts, the ‘normal phase’ in which 
the particles are distributed among the excited energy states and the ‘condensed phase’ in 
which all the remaining atoms of the gas are in the quantised ground state. At low enough 
temperatures, essentially all the particles would be in this condensed phase. Although it was 
not appreciated at the time, this was the first realisation of a phase transition using the 
techniques of statistical mechanics of identical particles. 
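
To give a feeling for the magnitudes involved, the condensation temperature $T_B = (h^2/2\pi m k)(N/2.612\,V)^{2/3}$ can be evaluated for a dilute gas. The sketch below is illustrative only: the choice of rubidium-87 atoms, the number density of $10^{20}$ m$^{-3}$ and the function name are assumptions introduced here, not values taken from the text, and the atomic mass is approximated as 87 atomic mass units.

```python
import math

H = 6.62607015e-34                 # Planck constant, J s
K_B = 1.380649e-23                 # Boltzmann constant, J/K
ATOMIC_MASS = 1.66053906660e-27    # atomic mass unit, kg


def condensation_temperature(mass, number_density):
    """T_B = (h^2 / 2 pi m k) * (n / 2.612)^(2/3), with n = N/V in m^-3."""
    prefactor = H ** 2 / (2.0 * math.pi * mass * K_B)
    return prefactor * (number_density / 2.612) ** (2.0 / 3.0)


mass_rb87 = 87 * ATOMIC_MASS       # assumed: mass approximated by mass number
t_b = condensation_temperature(mass_rb87, 1.0e20)
t_b_doubled = condensation_temperature(mass_rb87, 2.0e20)

print(t_b, t_b_doubled / t_b)
```

Under these assumptions $T_B$ comes out at a few hundred nanokelvin, and doubling the density raises it by the factor $2^{2/3}$ expected from the formula.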

Einstein appreciated fully that Bose’s statistics were very different from classical 
Boltzmann statistics. The key difference is that the particles can no longer be considered to be 
statistically independent. As explained in Sect. 9.2, since the particles are indistinguishable, 
each possible configuration is counted only once. In Boltzmann statistics the configurations 
[A|B] and [B|A] are considered as separate realisations of the distribution of distinguishable 
particles, whereas according to Bose-Einstein statistics, if A and B are indistinguishable, 
they are counted only once. This induces a statistical correlation between the particles. In 
his second paper on the Bose-Einstein gas, Einstein was quite clear about the strange nature 
of the new statistics (Einstein, 1925): 


‘[It] therefore expresses indirectly a certain hypothesis about a mutual dependence of the 
molecules, which for the present is of a totally enigmatic nature and which just creates 
the same statistical probability for the cases which are defined here as complexions.’ 


There were two other suggestive features of the distribution to which Einstein drew 
attention. First, the new distribution was consistent with Nernst’s heat theorem, or the third 
law of thermodynamics, according to which the entropy of all gases tends to zero at zero 
temperature. This is indeed the case for a Bose—Einstein distribution, but not for a classical 
Boltzmann distribution. 

Secondly, just as in the case of black-body radiation, the fluctuation in the number 
density of particles is composed of two parts. The fractional fluctuation for photons is 
given by (3.41) and a similar relation is found for the monatomic gas, 

\[ \left(\frac{\Delta_\nu}{n_\nu}\right)^2 = \frac{1}{n_\nu} + \frac{1}{z_\nu}, \tag{14.7} \]

where $n_\nu$ is the mean number of atoms and $z_\nu$ is the number of cells in phase space with 
energies in the interval $E$ to $E + dE$. The first term is the usual statistical fluctuation in 
the number of particles according to a Poisson distribution, $\Delta N/N \sim N^{-1/2}$. The 
second term arises from fluctuations associated with the superposition of random waves, as 
was demonstrated in Sect. 3.6.2. Again, Einstein fully appreciated the significance of this 
result: 





‘It arises in the case of radiation from interference fluctuations. We may also interpret 
it for gases in a corresponding way by associating with the gas, in a suitable manner, a 
ray phenomenon and then computing the interference fluctuations of the latter.’ (Einstein, 
1925) 


He went on to remark that 
‘I pursue this interpretation further, since I believe that here we have to do with more than 
a mere analogy.’ 
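
The two contributions to the fluctuation formula (14.7), a particle-like term $1/n$ and a wave-like term $1/z$, dominate in opposite regimes. A minimal numerical sketch follows; the function name and the sample values of the mean occupation and the number of cells are hypothetical choices made only for illustration.

```python
def fractional_fluctuation(n_mean, z_cells):
    """Return the particle term 1/n, the wave term 1/z and their sum."""
    particle_term = 1.0 / n_mean
    wave_term = 1.0 / z_cells
    return particle_term, wave_term, particle_term + wave_term


# Few particles spread over many cells: the Poisson-like term dominates.
sparse = fractional_fluctuation(10.0, 1.0e6)

# Many particles crowded into few cells: the interference term dominates.
dense = fractional_fluctuation(1.0e6, 10.0)

print(sparse, dense)
```

The wave-like term therefore matters precisely when many particles share each cell of phase space, which is the degenerate regime in which the Bose-Einstein distribution departs from the Boltzmann distribution.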


In December 1924, Einstein had been sent a copy of de Broglie’s doctoral thesis by 
Langevin, who had been one of its examiners. Einstein was profoundly impressed, referring 
to it as ‘a very notable publication’. In his paper of 1925, he suggested that the type of 
matter waves proposed by de Broglie enabled him to interpret the fluctuation term in (14.7) 
for the Bose-Einstein gas and repeated de Broglie’s suggestion that experiments might be 
carried out to search for the interference of matter waves in beams of particles, although 
he realised that the effect would be extremely small. The subsequent story of the discovery 
of electron diffraction by Davisson and George Thomson was recounted in Sect. 9.4, but 
these results were not established at the time when Schrédinger began to take matter waves 
seriously. 

Schrödinger’s correspondence with Einstein began in February 1925 when he was 
studying the various expressions for the entropy of an ideal gas at low temperatures. Having 
studied Einstein’s first paper on the equation of state of an ideal gas, he wrote to Einstein 
about his concern that the particle distribution according to Bose-Einstein statistics was 
inconsistent with the Boltzmann distribution. Einstein responded by stating that his concern 
was indeed valid, but there was nothing wrong with his calculation. As Einstein wrote, 


‘Your reproach is not unjustified, although I have not made a mistake in my paper. In 
Bose statistics, which I use, the quanta or molecules are not considered as being mutually 
independent objects.’ 


Schrödinger delayed replying until November 1925 when he confessed that he had not 
appreciated the originality of Einstein’s paper. In particular, he was intrigued by the new 
light it shed upon quantum degeneracy. 

In the same letter, Schrödinger makes his first reference to de Broglie’s thesis (de Broglie, 
1924a). He had been intrigued by Einstein’s reference to it in his second paper on 
Bose-Einstein statistics (Einstein, 1925) and only obtained a copy of it in late summer 1925. 
He discussed the thesis with Pieter Debye, his colleague at Zürich, who suggested that 
Schrödinger present a colloquium on de Broglie’s ideas. This duly took place in late 
November or early December 1925. According to Felix Bloch, 


‘When he had finished, Debye casually remarked that he thought this way of talking was 
rather childish. As a student of Sommerfeld, he had learned that, to deal properly with 
waves, one had to have a wave equation. It sounded quite trivial and did not seem to 
make a great impression, but Schrödinger evidently thought a bit more about the idea 
afterwards.’ (Bloch, 1976) 


Just a few weeks later, Schrödinger gave another colloquium in which he is reported to have 
begun with the following words: 


‘My colleague Debye suggested that one should have a wave equation; well, I have found 
one.’ 


This was the discovery of Schrödinger’s wave equation for the hydrogen atom. 
14.3 The relativistic Schrödinger wave equation 


In seeking a wave equation to describe de Broglie’s matter waves, Schrödinger began by 
attempting to find an appropriate relativistic wave equation. The reason is clear from the 
fact that, as demonstrated in Sect. 9.3, de Broglie made extensive use of relativistic arguments 
to find a self-consistent description of the phase relations of the matter waves and the 
kinematics of the electron. These first attempts at the derivation of the relativistic wave 
equation were never published, but the argument can be traced in Schrödinger’s notebooks 
and a three-page memorandum he wrote on the eigenvibrations of the hydrogen atom. This 
first attempt was to be superseded by his first non-relativistic paper on wave mechanics 
(Mehra and Rechenberg, 1987). 

Schrödinger was seeking a time-independent wave equation which would have the 
standard form 

\[ \Delta\psi + k^2\psi = 0, \tag{14.8} \]


where $k = 2\pi/\lambda$ is the wavenumber of the wave. $u = \omega/k$ is the phase velocity of the wave 
and, in his dissertation, de Broglie had shown that $u$ is the superluminal velocity $u = c^2/v$, 
where $v$ is the speed of the electron (Sect. 9.3). Schrödinger therefore made the following 
identification: 

\[ u = \frac{c^2}{v} = \frac{\gamma m_e c^2}{\gamma m_e v} = \frac{E}{p} = \frac{h\nu}{\gamma m_e v}. \tag{14.9} \]
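
The identification of the phase velocity $E/p$ with $c^2/v$ for a free relativistic particle can be checked directly. The sketch below assumes standard values for the speed of light and the electron mass and uses a hypothetical function name; the electron speed of $0.1c$ is an arbitrary illustrative choice.

```python
import math

C = 299_792_458.0          # speed of light, m/s
M_E = 9.1093837015e-31     # electron mass, kg


def phase_velocity(v):
    """Phase velocity u = E/p of the matter wave of a free electron of speed v."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    energy = gamma * M_E * C ** 2      # total relativistic energy
    momentum = gamma * M_E * v         # relativistic momentum
    return energy / momentum


v = 0.1 * C
u = phase_velocity(v)
print(u / C, u * v / C ** 2)
```

The phase velocity exceeds $c$, while the product $uv$ equals $c^2$, so the particle speed $v$ itself remains subluminal.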





In place of de Broglie’s relation $E = h\nu = \gamma m_e c^2$, Schrödinger includes the electrostatic 
potential energy so that 

\[ h\nu = \gamma m_e c^2 - \frac{e^2}{4\pi\epsilon_0 r}. \tag{14.10} \]





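These expressions were rearranged to eliminate the velocity $v$ of the electron so that 

\[ \frac{v}{c} = \left[1 - \frac{(m_e c^2)^2}{\left(h\nu + \dfrac{e^2}{4\pi\epsilon_0 r}\right)^2}\right]^{1/2}. \tag{14.11} \]

Then, the phase velocity of the electron is 

\[ u = \frac{h\nu}{\gamma m_e v} = \frac{h\nu}{m_e c}\left[\left(\frac{h\nu}{m_e c^2} + \frac{e^2}{4\pi\epsilon_0 m_e c^2 r}\right)^2 - 1\right]^{-1/2}. \tag{14.12} \]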

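Using the relation $k = 2\pi\nu/u$ and substituting for $u$ into the wave equation (14.8) using 
(14.12), Schrödinger found the following wave equation, 

\[ \Delta\psi + \frac{4\pi^2 m_e^2 c^2}{h^2}\left[\left(\frac{h\nu}{m_e c^2} + \frac{e^2}{4\pi\epsilon_0 m_e c^2 r}\right)^2 - 1\right]\psi = 0. \tag{14.13} \]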
As we will see, this equation is of the same form as the non-relativistic wave 
equation. 
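
The route from the energy relation (14.10) to the wave equation (14.13) can be written out as a short worked sketch. It assumes only the identifications already made in the text, namely $k = 2\pi\nu/u$ and $u = h\nu/\gamma m_e v$, together with the relativistic identity $\gamma^2 v^2/c^2 = \gamma^2 - 1$:

```latex
\begin{align*}
k &= \frac{2\pi\nu}{u} = \frac{2\pi\,\gamma m_e v}{h}, \\
k^2 &= \frac{4\pi^2 m_e^2 c^2}{h^2}\,\frac{\gamma^2 v^2}{c^2}
     = \frac{4\pi^2 m_e^2 c^2}{h^2}\left(\gamma^2 - 1\right), \\
\gamma &= \frac{h\nu}{m_e c^2} + \frac{e^2}{4\pi\epsilon_0 m_e c^2 r}
  \qquad\text{from (14.10)}, \\
k^2 &= \frac{4\pi^2 m_e^2 c^2}{h^2}
  \left[\left(\frac{h\nu}{m_e c^2} + \frac{e^2}{4\pi\epsilon_0 m_e c^2 r}\right)^2 - 1\right].
\end{align*}
```

Inserting this expression for $k^2$ into $\Delta\psi + k^2\psi = 0$ reproduces (14.13).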

Schrödinger was now on home ground and knew exactly how to treat (14.13) as an 
eigenvalue problem. All he had to do was to follow the standard procedure of separating 
variables in spherical polar coordinates, namely $\psi(r, \theta, \phi) = R(r)\,\Theta(\theta)\,\Phi(\phi)$, and 
apply the appropriate boundary conditions. The functions $\Theta(\theta)$ and $\Phi(\phi)$ could be expressed 
by the standard angular functions for $m \geq 0$ 

\[ Y_l^m(\theta, \phi) = \Theta(\theta)\,\Phi(\phi) = (-1)^m\left[\frac{(2l+1)}{4\pi}\,\frac{(l-m)!}{(l+m)!}\right]^{1/2} P_l^m(\cos\theta)\exp(im\phi), \tag{14.14} \]

where $P_l^m(\cos\theta)$ are the associated Legendre functions and $l$ is a non-negative integer. For values 
$m < 0$, the spherical harmonics are 

\[ Y_l^{-|m|}(\theta, \phi) = (-1)^{|m|}\left[Y_l^{|m|}(\theta, \phi)\right]^{*}, \tag{14.15} \]

where the asterisk denotes the complex conjugate of $Y_l^{|m|}(\theta, \phi)$. The radial equation could 
be written as an ordinary differential equation in the following form, 

\[ \frac{d^2R}{dr^2} + \frac{2}{r}\frac{dR}{dr} + \left(A + \frac{2B}{r} + \frac{C}{r^2}\right)R = 0. \tag{14.16} \]





Schrödinger, after a bit of effort, solved this equation and found quantised energy levels, 
but the solution was not quite right. He was well aware of the fact that one of the triumphs 
of the old quantum theory had been Sommerfeld’s relativistic model for the hydrogen 
atom which was discussed in Sect. 5.3. The issue is illustrated by the analysis of Mehra 
and Rechenberg (1987) who show that the solutions for the constants $A$ and $B$ can be 
written in equivalent form in terms of the quantities $A'$ and $B'$ for both the Sommerfeld 
and Schrödinger solutions as follows: 








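\[
\begin{aligned}
\text{Sommerfeld:}\quad & \frac{B'}{\sqrt{-A'}} = n_r + \sqrt{k^2 - \alpha^2}, \\
\text{Schrödinger:}\quad & \frac{B'}{\sqrt{-A'}} = n_r + \tfrac{1}{2} + \sqrt{\left(k + \tfrac{1}{2}\right)^2 - \alpha^2}.
\end{aligned}
\]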


The constants $A'$ and $B'$ determine the energies of the stationary states of the electron in 
the hydrogen atom and Sommerfeld’s expression (5.36) was in excellent agreement with 
experiment. The extra factors of $\tfrac{1}{2}$ in Schrödinger’s expression spoiled that agreement and 
would require the introduction of half-integral quantum numbers. This was a 
discouragement for Schrödinger and he set the problem aside for a brief period while he completed 
other pieces of writing. Although it was not understood at the time, what had gone wrong 
was that, as soon as a relativistic theory of the hydrogen atom is adopted, the spin of the 
electron has to be included. This was, however, a significant piece of analysis: it was the 
first time a wave equation for the electron in the atom had been introduced into quantum 
physics. Notice that de Broglie’s waves were propagating waves whereas Schrödinger had 
converted the problem into one of standing waves, like the vibrations of a violin string under 
tension. 
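
The effect of the half-integral terms on the fine structure can be estimated numerically. The sketch below is not taken from Schrödinger's notebooks; it assumes the standard closed-form energy $E = m_e c^2\,[1 + \alpha^2/N^2]^{-1/2}$, with $N = n_r + \sqrt{k^2 - \alpha^2}$ for Sommerfeld's model and $N = n_r + \tfrac{1}{2} + \sqrt{(l + \tfrac{1}{2})^2 - \alpha^2}$ for the relativistic scalar wave equation, and it compares the splitting of the $n = 2$ level in the two cases. The function names and the numerical constants are introduced here for illustration.

```python
import math

ALPHA = 7.2973525693e-3    # fine-structure constant
MC2_EV = 510_998.95        # electron rest energy in eV


def sommerfeld_energy(n_r, k):
    """Total energy for radial number n_r and azimuthal number k (old theory)."""
    effective_n = n_r + math.sqrt(k * k - ALPHA ** 2)
    return MC2_EV / math.sqrt(1.0 + (ALPHA / effective_n) ** 2)


def scalar_wave_energy(n_r, l):
    """Total energy when the half-integral terms are present."""
    effective_n = n_r + 0.5 + math.sqrt((l + 0.5) ** 2 - ALPHA ** 2)
    return MC2_EV / math.sqrt(1.0 + (ALPHA / effective_n) ** 2)


# Splitting of the n = 2 level in the two descriptions, in eV.
sommerfeld_split = sommerfeld_energy(0, 2) - sommerfeld_energy(1, 1)
scalar_split = scalar_wave_energy(0, 1) - scalar_wave_energy(1, 0)
ratio = scalar_split / sommerfeld_split

print(sommerfeld_split, scalar_split, ratio)
```

Under these assumptions the splitting with half-integral terms is larger than Sommerfeld's by a factor close to 8/3, which indicates why the result could not be reconciled with the observed fine structure.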
14.4 Quantisation as an Eigenvalue Problem (Part 1) 


14.4.1 Preliminaries 


Schrödinger spent the Christmas break of 1925-1926 at the Villa Herwig at Arosa where 
he intended to relax and enjoy the skiing. His mind was, however, consumed by his recent 
researches at the expense of what should have been a period of relaxation. As he noted in 
a letter to Willy Wien of 27 December 1925, 


‘At the moment I am plagued by a new atomic theory...I believe I can write down a 
vibrating system — constructed in a comparatively natural manner and not by ad hoc 
assumptions — which has as its eigenfrequencies the term frequencies of the hydrogen 
atom.’ 


His notebooks show that the first derivation of the non-relativistic wave equation for the 
hydrogen atom was formulated during that holiday. In fact, the derivation was no more than 
a straightforward simplification of the relativistic wave equation discussed in Sect. 14.3. In 
the non-relativistic case, (14.10) becomes 








$$h\nu = m_e c^2 + \tfrac{1}{2} m_e v^2 - \frac{e^2}{4\pi\epsilon_0 r}, \tag{14.17}$$

and the phase velocity of the wave (14.9) becomes 

$$u = \frac{E}{p} = \frac{h\nu}{m_e v}. \tag{14.18}$$


As before, eliminating v between (14.17) and (14.18), a simpler expression for the phase 
velocity is obtained, 


$$u = \frac{h\nu}{m_e v} = \frac{h\nu}{\sqrt{2 m_e \left( h\nu - m_e c^2 + \dfrac{e^2}{4\pi\epsilon_0 r} \right)}}, \tag{14.19}$$


and so substituting into the wave equation (14.8), 











$$\nabla^2\psi + \frac{8\pi^2 m_e}{h^2}\left( h\nu - m_e c^2 + \frac{e^2}{4\pi\epsilon_0 r} \right)\psi = 0. \tag{14.20}$$


This wave equation could again be solved by separation of variables in spherical polar 
coordinates, ψ(r, θ, φ) = R(r) Θ(θ) Φ(φ), as in Sect. 14.3, resulting in the same form of 
ordinary differential equation for R(r), 











$$\frac{\mathrm{d}^2 R}{\mathrm{d}r^2} + \frac{2}{r}\frac{\mathrm{d}R}{\mathrm{d}r} + \left( -A + \frac{2B}{r} - \frac{C}{r^2} \right) R = 0, \tag{14.21}$$

but now with much simpler expressions for the coefficients A, B and C: 

$$A = \frac{8\pi^2 m_e (m_e c^2 - h\nu)}{h^2}, \qquad B = \frac{\pi m_e e^2}{\epsilon_0 h^2}, \qquad C = l(l+1), \tag{14.22}$$

where l is an integer corresponding to the azimuthal quantum number in the old quantum 
theory. The spherical harmonic solution (14.14) forces the condition that the following 
expression for A, B and C should only take integral values, n_r, 


$$\frac{B}{\sqrt{A}} - \sqrt{C + \tfrac{1}{4}} - \frac{1}{2} = n_r. \tag{14.23}$$


Inserting the above values for A, B and C, he found 








$$\sqrt{\frac{m_e e^4}{8\epsilon_0^2 h^2 (m_e c^2 - h\nu)}} = (n_r + l + 1) = n. \tag{14.24}$$

Squaring both sides and inserting the values of A and B, we find 

$$\frac{R_\infty h c}{m_e c^2 - h\nu} = n^2, \tag{14.25}$$

which can be reorganised as follows: 

$$E = h\nu = m_e c^2 - \frac{R_\infty h c}{n^2}, \tag{14.26}$$

where R∞ is the Rydberg constant, R∞ = m_e e⁴/(8ε₀²h³c). This was a startling result: the 
energies of the standing waves were exactly those of the stationary states of the hydrogen 
atom and the differences in energies between them would result in Bohr’s formula for the 
lines observed in the hydrogen spectrum. 
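
As a numerical check of (14.26), the following Python sketch evaluates the Rydberg constant from its definition and then the wavelengths of the first few Balmer lines from the differences between the energy levels. It is an illustration only and not part of Schrödinger's analysis: the function names are ours, and the physical constants are rounded CODATA values typed in by hand.

```python
# Sketch: E_n = m_e c^2 - R_inf h c / n^2 (eq. 14.26) and the Balmer lines.
# The constants below are rounded CODATA values entered by hand (an assumption).
import math

M_E = 9.1093837e-31      # electron mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878e-12     # vacuum permittivity, F/m
H = 6.62607015e-34       # Planck constant, J s
C = 2.99792458e8         # speed of light, m/s


def rydberg_constant():
    """R_inf = m_e e^4 / (8 eps0^2 h^3 c), in m^-1."""
    return M_E * E_CHARGE**4 / (8.0 * EPS0**2 * H**3 * C)


def binding_energy_ev(n):
    """Energy of level n measured from the rest energy m_e c^2, in eV."""
    return -rydberg_constant() * H * C / n**2 / E_CHARGE


def line_wavelength_nm(n_lower, n_upper):
    """Vacuum wavelength of the transition n_upper -> n_lower, in nm."""
    inverse_wavelength = rydberg_constant() * (1.0 / n_lower**2 - 1.0 / n_upper**2)
    return 1e9 / inverse_wavelength


if __name__ == "__main__":
    print(f"R_inf = {rydberg_constant():.6e} m^-1")
    print(f"E_1 - m_e c^2 = {binding_energy_ev(1):.3f} eV")
    for n in (3, 4, 5):
        print(f"Balmer line {n} -> 2: {line_wavelength_nm(2, n):.1f} nm")
```

The wavelengths are those for an infinitely heavy nucleus in vacuum, so they differ slightly from laboratory values measured in air.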

Schrödinger still had a lot to do. In particular, he still had to solve (14.21) for the 
dependence of ψ upon radius. He had with him Schlesinger’s textbook Introduction to the 
Theory of Differential Equations (1900) and struggled to find the solution. In fact, Courant 
and Hilbert’s Methods of Mathematical Physics, Volume 1 had appeared in 1924 and gave 
a solution to a similar equation in terms of Laguerre polynomials — the actual solution is 
obtained using associated Laguerre polynomials, which appear in Part 3 of Schrödinger’s 
papers of 1926 with the same title Quantisation as an eigenvalue problem (Schrödinger, 
19266). 


14.4.2 The non-relativistic theory of the hydrogen atom 


Within a few weeks, Schrödinger completed the first of his great series of six papers and it 
was received by the Annalen der Physik on 27 January 1926. The ideas and approach were 
refined with the striking feature that he devotes most of the arguments to the mathematical 
requirements of the wave equation he had discovered, rather than relating the formalism 
to the physics of the hydrogen atom, or quantum phenomena in general. He explains the 
reason for this approach in the third section of his paper. 


‘It is, of course, strongly suggested that we should try to connect the function ψ with 
some vibration process in the atom, which would more nearly approach reality than the 
electronic orbits, the real existence of which is being very much questioned today. I 
originally intended to found the new quantum conditions in this more intuitive manner, 
but finally gave them the above neutral mathematical form, because it brings more clearly 
to light what is really essential. The essential thing seems to me to be, that the postulation 
of “whole numbers” no longer enters into the quantum rules mysteriously, but that we 
have traced the matter a step further back, and found the “integralness” to have its origin 
in the finiteness and single-valuedness of a certain space function.’ 


Schrödinger was to defer the more physical arguments to the second paper of the series. 
His first step was to rederive the wave equation (14.20) in a somewhat more formal 
manner, starting from the Hamilton-Jacobi differential equation (5.86), 


$$H\left(q, \frac{\partial S}{\partial q}\right) = E, \tag{14.27}$$


which was derived in Sect. 5.4.4. He points out that the solutions for S are normally found 
in terms of the sum of functions, each being a function of only one independent variable 
q_i. This was illustrated in the case of Sommerfeld’s analysis of the elliptical orbits in 
the old quantum theory in Sect. 5.5. Rewriting (5.98) in our present notation in Cartesian 
coordinates, 

$$H = \frac{1}{2m_e}\left[ \left(\frac{\partial S}{\partial x}\right)^2 + \left(\frac{\partial S}{\partial y}\right)^2 + \left(\frac{\partial S}{\partial z}\right)^2 \right] - \frac{Ze^2}{4\pi\epsilon_0 r} = E. \tag{14.28}$$


Now, in his analysis of the wave equation (14.20), Schrödinger had shown that the solution 
should be sought in terms of the products of independent functions as illustrated by the 
separation of variables which led to (14.21). In contrast, the analysis of Sect. 5.5 showed that 
the solution is written as the sum of independent functions of the orthogonal coordinates 
r, θ, φ. In Cartesian notation, (5.99) would be written 

$$S = S_x(x) + S_y(y) + S_z(z). \tag{14.29}$$


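The solution is to write S as the logarithm of some function ψ such that S = K ln ψ, so that 
a sum of independent functions for S corresponds to a product of independent functions for ψ. 
Then, (14.27) can be written 

$$H\left(q, \frac{K}{\psi}\frac{\partial\psi}{\partial q}\right) = E. \tag{14.30}$$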


He states that he is to apply this formalism to the non-relativistic case of the hydrogen atom. 
Then comes the key to the new approach: 


‘We now seek the function ψ, such that for any arbitrary variation of it the integral of 
the said quadratic form, taken over the whole of coordinate space, is stationary, ψ being 
everywhere real, single-valued, finite, and continuously differentiable up to second order. 
The quantum conditions are replaced by this variation problem.’ 


Now, (14.28) can be rewritten 


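$$\frac{K^2}{2m_e}\,\frac{1}{\psi^2}\left[ \left(\frac{\partial\psi}{\partial x}\right)^2 + \left(\frac{\partial\psi}{\partial y}\right)^2 + \left(\frac{\partial\psi}{\partial z}\right)^2 \right] - \frac{Ze^2}{4\pi\epsilon_0 r} = E, \tag{14.31}$$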


or, converting this into an appropriate form for finding stationary solutions for w, 


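$$\left(\frac{\partial\psi}{\partial x}\right)^2 + \left(\frac{\partial\psi}{\partial y}\right)^2 + \left(\frac{\partial\psi}{\partial z}\right)^2 - \frac{2m_e}{K^2}\left(E + \frac{Ze^2}{4\pi\epsilon_0 r}\right)\psi^2 = 0. \tag{14.32}$$

This is the quadratic form which is to be subject to arbitrary variations over the whole of 
coordinate space. The stationary solution is found by requiring δJ = 0, that is, 

$$\delta J = \delta \iiint \mathrm{d}x\,\mathrm{d}y\,\mathrm{d}z \left[ \left(\frac{\partial\psi}{\partial x}\right)^2 + \left(\frac{\partial\psi}{\partial y}\right)^2 + \left(\frac{\partial\psi}{\partial z}\right)^2 - \frac{2m_e}{K^2}\left(E + \frac{Ze^2}{4\pi\epsilon_0 r}\right)\psi^2 \right] = 0, \tag{14.33}$$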


where the integral is taken over all space. Schrödinger then states that 





‘From this we find in the usual way,’ 


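$$\tfrac{1}{2}\,\delta J = \oint \mathrm{d}A\;\delta\psi\,\frac{\partial\psi}{\partial n} - \iiint \mathrm{d}x\,\mathrm{d}y\,\mathrm{d}z\;\delta\psi \left[ \nabla^2\psi + \frac{2m_e}{K^2}\left(E + \frac{Ze^2}{4\pi\epsilon_0 r}\right)\psi \right] = 0, \tag{14.34}$$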





where dn is the differential element of distance perpendicular to the element of area dA. 
We outline in the endnotes how this expression is derived using the tools developed in 
Sect. 5.4.2. For arbitrary variations δψ, both terms must be zero and so it follows that, 
with Z = 1 for the hydrogen atom, 

$$\nabla^2\psi + \frac{2m_e}{K^2}\left(E + \frac{e^2}{4\pi\epsilon_0 r}\right)\psi = 0, \tag{14.35}$$


with the requirement that the first integral of (14.34) should be zero when taken over 4π 
steradians at infinity, 

$$\oint \delta\psi\,\frac{\partial\psi}{\partial n}\,\mathrm{d}A = 0. \tag{14.36}$$
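
The step which Schrödinger describes as ‘the usual way’ can be sketched as follows. This is the standard integration-by-parts argument and not a transcription of the endnote; the abbreviation V(r) = −Ze²/(4πε₀r) and the shorthand d³x = dx dy dz are introduced only for this sketch.

```latex
\begin{align*}
J &= \iiint \left[ (\nabla\psi)^2 - \frac{2m_e}{K^2}\,(E - V)\,\psi^2 \right] \mathrm{d}^3x , \\
\delta J &= 2 \iiint \left[ \nabla\psi\cdot\nabla(\delta\psi)
            - \frac{2m_e}{K^2}\,(E - V)\,\psi\,\delta\psi \right] \mathrm{d}^3x , \\
\nabla\psi\cdot\nabla(\delta\psi) &= \nabla\cdot(\delta\psi\,\nabla\psi) - \delta\psi\,\nabla^2\psi , \\
\tfrac{1}{2}\,\delta J &= \oint \delta\psi\,\frac{\partial\psi}{\partial n}\,\mathrm{d}A
   - \iiint \delta\psi \left[ \nabla^2\psi + \frac{2m_e}{K^2}\,(E - V)\,\psi \right] \mathrm{d}^3x .
\end{align*}
```

The divergence theorem converts the volume integral of ∇·(δψ ∇ψ) into the surface integral, which is why the boundary condition at infinity appears as a separate requirement.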


Schrödinger had already derived an equation of similar form starting from his interpreta- 
tion of de Broglie’s concept of matter waves which resulted in (14.20). Comparing (14.20) 
and (14.35), it immediately follows that K = h/2π. Therefore, 

$$\nabla^2\psi + \frac{8\pi^2 m_e}{h^2}\left(E + \frac{e^2}{4\pi\epsilon_0 r}\right)\psi = 0. \tag{14.37}$$

This is the definitive version of Schrödinger’s time-independent wave equation for the 
hydrogen atom. Most of the rest of the paper is concerned with a careful analysis of the 
mathematical properties of the wave equation (14.37). 

First of all, Schrödinger recognised that, because of the spherical symmetry of the 
problem, spherical polar coordinates were the natural system to adopt. He sought solutions in 
these coordinates by separation of variables, ψ = R(r) Θ(θ) Φ(φ), the functions Y(θ, φ) = 
Θ(θ) Φ(φ) being the usual spherical harmonics given by (14.14) and (14.15). The key 
feature of these functions is that, in order that the spherical harmonics are single-valued, 
l can only take non-negative integral values l ≥ 0 while m can only take integral values in the 
range −l ≤ m ≤ l. The spherical harmonics form a complete, orthogonal set of functions 
so that any distribution on a sphere can be decomposed into the sum of spherical harmonics. 

The separation of variables results in the following equation for R(r), 





$$\frac{\mathrm{d}^2 R}{\mathrm{d}r^2} + \frac{2}{r}\frac{\mathrm{d}R}{\mathrm{d}r} + \left[ \frac{8\pi^2 m_e E}{h^2} + \frac{8\pi^2 m_e e^2}{4\pi\epsilon_0 h^2 r} - \frac{l(l+1)}{r^2} \right] R = 0, \tag{14.38}$$

where l = 0, 1, 2, 3, ... Schrödinger now carries out a detailed analysis of the properties of 
acceptable solutions of this equation, paying particular attention to the issues of singularities 
at r = 0 and r = ∞. The solutions are found in terms of Laplace transformations and are 
characterised by the quantity 

$$\frac{m_e e^2}{\sqrt{-8\epsilon_0^2 h^2 m_e E}}. \tag{14.39}$$

He then establishes the following results (the italics are Schrödinger’s): 


1. [(14.38)] has, for every positive E, solutions which are everywhere single-valued, finite, 
and continuous; and which tend to zero as 1/r at infinity under continual oscillations. 

2. For negative values of E which do not satisfy the condition [that (14.39) takes integral 
values] our variation problem has no solution. 

3. For negative values of E, our variation problem has solutions, if and only if, E satisfies 
the condition 


$$\frac{m_e e^2}{\sqrt{-8\epsilon_0^2 h^2 m_e E}} = n, \tag{14.40}$$

where n = 1, 2, 3, 4, ... 

4. There are no solutions of the problem if l ≥ n. Only values smaller than n (and there 
is always one such at our disposal) can be given to the integer l, which denotes the 
order of the surface harmonic appearing in the equation. 


The energies of the stationary states can now be found from (14.40) and are 


$$E_n = -\frac{m_e e^4}{8\epsilon_0^2 h^2 n^2}. \tag{14.41}$$
These are precisely the energies of the stationary states of the electron in the hydrogen 
atom. 
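
The way in which the boundary conditions alone select the energies (14.41) can be illustrated numerically. The sketch below is not Schrödinger's Laplace-transform method: it replaces the radial equation (14.38) by a finite-difference matrix, written for u(r) = rR(r) in atomic units (ħ = m_e = e²/4πε₀ = 1, so that E_n = −1/2n²), and imposes u = 0 at r = 0 and at a large outer radius. The grid size, outer radius and function name are our own choices.

```python
# Sketch: finite-difference eigenvalues of the radial equation (14.38).
# In atomic units, with u(r) = r R(r), the equation becomes
#   -u''/2 + [l(l+1)/(2 r^2) - 1/r] u = E u,   u(0) = u(r_max) = 0.
import numpy as np


def radial_levels(l=0, r_max=60.0, n_points=1200, n_levels=3):
    """Return the lowest n_levels eigenvalues for angular quantum number l."""
    h = r_max / (n_points + 1)
    r = h * np.arange(1, n_points + 1)           # interior grid points only
    diagonal = 1.0 / h**2 + l * (l + 1) / (2.0 * r**2) - 1.0 / r
    off_diagonal = -0.5 / h**2 * np.ones(n_points - 1)
    hamiltonian = (np.diag(diagonal)
                   + np.diag(off_diagonal, 1)
                   + np.diag(off_diagonal, -1))
    return np.linalg.eigvalsh(hamiltonian)[:n_levels]


if __name__ == "__main__":
    for l in (0, 1):
        for i, energy in enumerate(radial_levels(l=l)):
            n = l + 1 + i                         # n = n_r + l + 1
            print(f"l={l} n={n}: E = {energy:+.5f}, -1/(2n^2) = {-0.5 / n**2:+.5f}")
```

The lowest eigenvalue for a given l corresponds to n = l + 1, in agreement with result 4 above that only values of l smaller than n occur.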

Schrödinger fully appreciated the significance of these calculations. Immediately after 
his derivation of the energies of the stationary states of the hydrogen atom, he explains the 
significance of the quantum numbers — I have translated the quantum numbers into modern 
usage. 


‘Our n is the principal quantum number. l + 1 is analogous to the azimuthal quantum 
number. The splitting up of this number through a closer definition of the surface harmonic 
can be compared with the resolution of the azimuthal quantum number into an “equatorial” 
and “polar” quantum. These numbers here define the system of node-lines on the sphere. 
Also the “radial quantum number” n − l − 1 gives exactly the number of the “node-spheres”, 
for it is easily established that the function [R(r)] has exactly n − l − 1 positive 
real roots. The positive E-values correspond to the continuum of hyperbolic orbits, to 
which one may ascribe, in a certain sense, the radial quantum number ∞.’ 


He also notes that the amplitudes of the standing wave solutions for R(r) tend to zero at 
radii greater than a_n/n, where a_n is the corresponding Bohr-Sommerfeld value for the 
semi-major axis of the classical elliptical orbits. 

14.4.3 Reflections 


Schrödinger realised the importance of these calculations in providing insight into the 
nature of quantisation. Close to the end of his paper, he writes: 


‘It is hardly necessary to emphasise how much more congenial it would be to imagine 
that at a quantum transition the energy changes from one form of vibration to another, 
than to think of a jumping electron. The changing of the vibration form can take place 
continuously in space and time, and it can readily last as long as the emission process 
lasts empirically.’ 


He had made the remarkable discovery that quantisation can be introduced into the 
description of quantum phenomena through the essential requirement that the wavefunction 
ψ should be everywhere real, single-valued, finite, and continuously differentiable up to 
second order. As he emphasised, the somewhat arbitrary quantum conditions of the old 
quantum theory are replaced by these constraints on the properties of the wavefunction. 

The paper gives the impression of being written in the white heat of inspiration: 
Schrödinger realised that there were many loose ends to be tied up, but he was driven to 
publish his discovery in an almost unvarnished form. He fully acknowledges the importance 
of de Broglie’s papers: 


‘Above all, I wish to mention that I was led to these deliberations in the first place by the 
suggestive papers of M. Louis de Broglie (de Broglie, 1924a),...’ 


It is perfectly understandable that he should give the highest priority to publishing his 
wave equation, without going into a more thorough examination of his intellectual 
breakthrough. Still working at breakneck speed, he completed his next paper, Quantisation as an 
eigenvalue problem (Part 2), which was received by the Annalen der Physik on 23 February 1926. Part 2 might 
more logically have preceded Part 1 since it provides a deeper insight into how the wave 
equation can be formulated as an extension of the principles of Fermat and Hamilton. The 
analysis of that paper is our next task. 


14.5 Quantisation as an eigenvalue problem (Part 2) 


Schrödinger’s second paper was concerned with providing a more formal basis for his 
wave equation and providing further examples of its application (Schrödinger, 1926c). 
The attempt to place Lagrangian mechanics and physical optics on the same formal basis 
had been pioneered by Hamilton, inspired by Lagrange’s approach to classical mechanics 
and Fresnel’s success in developing the wave theory of optics to describe diffraction and 
interference phenomena. Hamilton’s objective was no less than to provide a single unified 
theory which would describe both the motion of particles and the propagation of light rays 
(Hamilton, 1833). He achieved this by showing that his characteristic function S, introduced 
in Sect. 5.4, could be applied to both the dynamics of particles and the paths of light 
rays. 

14.5.1 The fundamentals of undulatory mechanics 


For light rays, Fermat’s principle of least time states that the path of a light ray is that which 
minimises the time between the source of light and its detection through a medium in which 
the refractive index n varies with position, 

$$\delta \int \frac{n}{c}\,\mathrm{d}l = 0, \tag{14.42}$$

where c is the speed of light and dl is the element of distance. Snell’s law of refraction, 
n₁ sin θ₁ = n₂ sin θ₂, follows immediately from the geometrical application of the principle, 
where n₁ and n₂ are the refractive indices of the two media and θ₁ and θ₂ are the 
angles of incidence and refraction of the ray relative to the normal to the surface. 
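
The statement that Snell’s law follows from the principle of least time can be checked numerically. The sketch below, with a geometry and function names chosen by us purely for illustration, considers a flat interface at y = 0 between media of refractive indices n₁ (above) and n₂ (below), finds the crossing point which minimises the travel time by a golden-section search, and returns n₁ sin θ₁ and n₂ sin θ₂ for comparison.

```python
# Sketch: Fermat's principle of least time for refraction at a flat interface.
import math


def travel_time(x, source, detector, n1, n2, c=1.0):
    """Travel time for a ray crossing the interface y = 0 at position x."""
    (xs, ys), (xd, yd) = source, detector
    path_1 = math.hypot(x - xs, ys)      # path length in medium 1 (y > 0)
    path_2 = math.hypot(xd - x, yd)      # path length in medium 2 (y < 0)
    return (n1 * path_1 + n2 * path_2) / c


def crossing_point(source, detector, n1, n2, tol=1e-12):
    """Golden-section search for the crossing point of least time."""
    lo, hi = min(source[0], detector[0]), max(source[0], detector[0])
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        a = hi - ratio * (hi - lo)
        b = lo + ratio * (hi - lo)
        if travel_time(a, source, detector, n1, n2) < travel_time(b, source, detector, n1, n2):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)


def snell_check(source=(0.0, 1.0), detector=(2.0, -1.0), n1=1.0, n2=1.5):
    """Return (n1 sin(theta1), n2 sin(theta2)) for the least-time path."""
    x = crossing_point(source, detector, n1, n2)
    sin_1 = (x - source[0]) / math.hypot(x - source[0], source[1])
    sin_2 = (detector[0] - x) / math.hypot(detector[0] - x, detector[1])
    return n1 * sin_1, n2 * sin_2


if __name__ == "__main__":
    left, right = snell_check()
    print(f"n1 sin(theta1) = {left:.8f}, n2 sin(theta2) = {right:.8f}")
```

The two returned numbers agree to within the accuracy of the search, which is limited by floating-point rounding near the flat minimum of the travel time.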

The analogue of Fermat’s principle for mechanical systems can be readily derived from 
Maupertuis’ principle, historically the first of the variational principles, and a variant of the 
principle of least action. In the present instance, we are only interested in cases in which the 
time does not appear explicitly so that neither the Lagrangian nor the Hamiltonian depend 
upon time, in other words, the systems involve conservative fields of force. In terms of 
general coordinates, the Hamiltonian H(p, q) = E = constant. 

Landau and Lifshitz (1976) demonstrate that, under these conditions, Maupertuis’ prin- 
ciple can be written 


$$\delta S_0 = 0 \quad\text{where}\quad S_0 = \int \sum_i p_i\,\mathrm{d}q_i, \tag{14.43}$$

where p_i and q_i are the generalised momenta and coordinates introduced in Sect. 5.4.3, with p_i = ∂ℒ/∂q̇_i. 
Schrödinger uses the principle in its simplest form in which p_i is the Cartesian momentum 
p = mv and q the position coordinate. Then the trajectory of a particle is given by that path 
for which the minimisation condition is 

$$\delta \int 2T\,\mathrm{d}t = 0, \tag{14.44}$$

where T is the kinetic energy. Expression (14.44) was to be the starting point for 
Schrödinger’s next attack on the fundamentals of what he called in this paper undulatory 
mechanics. 

Starting from Maupertuis’ principle (14.44), dt can be replaced by dl/v where v is 
the velocity of the particle. Since the kinetic energy of the particle is ½mv², Maupertuis’ 
principle becomes 

$$\delta \int m v\,\mathrm{d}l = 0. \tag{14.45}$$

But, we can write v in terms of the energy E and the potential energy U as ½mv² + U = E 
and so the minimisation procedure becomes 

$$\delta \int [2m(E - U)]^{1/2}\,\mathrm{d}l = 0. \tag{14.46}$$


According to Hamilton, the minimisation procedure (14.46) should be the same as Fermat’s 
principle of least time (14.42). Therefore, the equivalence between the phase velocity v of 
the wave and the motion of a particle can be written symbolically 

$$v_{\rm wave} = \frac{c}{n} = \frac{C}{[2m(E - U)]^{1/2}}, \tag{14.47}$$

where C is a constant. Jammer (1989) refers to this equivalence as Hamilton’s 
optical-mechanical analogy. 

Hamilton took the analogy further. Hamilton’s action function is defined to be S = 
∫ ℒ dt, where ℒ = (T − U) is the Lagrangian, and defines an action surface S(x, y, z, t) = 
constant. The properties of the action surface are directly related to the dynamics of the 
particles of the system. From the action function, the following relations can be derived, as 
demonstrated in the endnotes. The momentum and total energy of the particle are given 
by 

$$\mathbf{p} = \nabla S \quad\text{and}\quad \frac{\partial S}{\partial t} = \mathcal{L} - \mathbf{p}\cdot\mathbf{v} = -E, \tag{14.48}$$

where E is the constant total energy for motion under conservative forces. The 
corresponding equations for the wavefront of a wave of the form exp(iφ) = exp[i(k · r − ωt)] 
are 

$$\mathbf{k} = \nabla\phi \quad\text{and}\quad \frac{\partial\phi}{\partial t} = -\omega, \tag{14.49}$$

where φ is the phase of the wave and ω = 2πν. Comparing (14.48) and (14.49), 
the surfaces of constant action of a system of particles are the exact analogues of the 
surfaces of constant phase for optical waves. In addition, the wavevector k is the analogue 
of the momentum p and the angular frequency w the analogue of the energy E of the 
particle. These insights, published by Hamilton between 1828 and 1837 (Hamilton, 1931), 
were far ahead of their time. The formal analogy was apparent, but there was no obvious 
interpretation of the velocity v as a wave velocity. With the exception of a few theorists, such 
as Felix Klein, Hamilton’s optical-mechanical analogy was neglected until Schrödinger 
brought the concept to the forefront of quantum theory in 1926. 

In his second paper, Schrödinger emphasises that Fermat’s principle in optics and Hamilton’s 
principle in mechanics provide the classical equivalence of ray optics and particle 
mechanics. Fermat’s principle makes no reference to the wave nature of light. In contrast, 
the variational procedures for wave motion were deeply embedded in Fresnel’s approach 
to wave optics and so there should be a parallel wave equivalence for particle motion. 
Classical ray optics works extremely well, provided the scale of the system is large com- 
pared with the wavelength of light. It breaks down, however, when the wavelength is of the 
order of scale of the system, giving rise to the characteristic diffraction and interference 
phenomena. Schrödinger postulates that the same should be true for particle mechanics. 
Classical mechanics works very well, provided the scale of the system is very much greater 
than the typical scale over which quantum phenomena are observed. On the atomic scale, 
however, the wave properties of matter cannot be neglected if the scale of the system is of 
the same order as the de Broglie wavelength, λ = h/p, where p is the momentum of the 
particle. 

Fig. 14.1 Schrödinger’s visualisation of the motion of the action surface defined by S = ∫ ℒ dt, where ℒ = T − U is the 
Lagrangian (Schrödinger, 1926c). 
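
The criterion that wave effects matter when the scale of the system is comparable with the de Broglie wavelength can be made concrete with a short calculation. In the sketch below the two test cases are our own illustrative choices: an electron with a kinetic energy of 13.6 eV, roughly the binding energy of hydrogen, and a hypothetical dust grain of mass one microgram moving at one millimetre per second. The constants are rounded values entered by hand.

```python
# Sketch: de Broglie wavelength lambda = h / p for an atomic and a macroscopic case.
import math

H_PLANCK = 6.62607015e-34     # Planck constant, J s
M_ELECTRON = 9.1093837e-31    # electron mass, kg
EV = 1.602176634e-19          # joules per electronvolt
BOHR_RADIUS = 5.29177e-11     # Bohr radius, m


def de_broglie_wavelength(mass, kinetic_energy):
    """lambda = h / p with the non-relativistic momentum p = sqrt(2 m T)."""
    return H_PLANCK / math.sqrt(2.0 * mass * kinetic_energy)


def electron_in_atom(kinetic_energy_ev=13.6):
    """Wavelength of an electron with an atomic-scale kinetic energy."""
    return de_broglie_wavelength(M_ELECTRON, kinetic_energy_ev * EV)


def dust_grain(mass=1e-9, speed=1e-3):
    """Wavelength of a hypothetical macroscopic grain (illustrative values)."""
    return de_broglie_wavelength(mass, 0.5 * mass * speed**2)


if __name__ == "__main__":
    print(f"electron: {electron_in_atom():.3e} m "
          f"= {electron_in_atom() / BOHR_RADIUS:.2f} Bohr radii")
    print(f"dust grain: {dust_grain():.3e} m")
```

For the electron the wavelength is close to the circumference of the first Bohr orbit, so that wave properties cannot be neglected, whereas for the grain it is many orders of magnitude smaller than any physical dimension of the system.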


Schrödinger first discusses the dynamics of the particle in terms of the motion of the 
action surface and shows that its normal velocity is 


$$v = \frac{E}{\sqrt{2m(E - U)}}. \tag{14.50}$$

His pictorial representation of the motion of the action surface is shown in Fig. 14.1. We 
can derive this relation directly from the equivalences discovered by Hamilton between 
the velocity of the action surface and that of a wavefront, (14.48) and (14.49) respectively. 
Thus, the phase velocity of the wave or action surface is 

$$v_{\rm ph} = \frac{\omega}{|\mathbf{k}|} = -\frac{\partial\phi/\partial t}{|\nabla\phi|} = -\frac{\partial S/\partial t}{|\nabla S|} = \frac{E}{p} = \frac{E}{\sqrt{2m(E - U)}}. \tag{14.51}$$

The problem which faced Hamilton and the nineteenth century theorists was that (14.50) 
is not the velocity of the particle which is 

$$v_{\rm part} = \frac{p}{m} = \frac{\sqrt{2m(E - U)}}{m}. \tag{14.52}$$

This was the principal reason why Hamilton’s insights were neglected. 

Schrödinger realised, however, that, according to undulatory mechanics, the expression 
(14.51) corresponded to a dispersion relation for standing waves associated with the 
stationary states of the electron in the atom. The energy of the stationary state is E = ħω and 
so (14.51) becomes 

$$\frac{\omega}{k} = \frac{\hbar\omega}{\sqrt{2m(\hbar\omega - U)}}, \qquad\text{that is,}\qquad k = \frac{1}{\hbar}\sqrt{2m(\hbar\omega - U)}. \tag{14.53}$$

Hence the group velocity v_gr = dω/dk, which is the velocity of a wave-packet, is given by 

$$\frac{\mathrm{d}k}{\mathrm{d}\omega} = \frac{1}{v_{\rm gr}} = \frac{m}{\sqrt{2m(\hbar\omega - U)}}, \qquad v_{\rm gr} = \frac{\sqrt{2m(E - U)}}{m}. \tag{14.54}$$

This is precisely the expression for the velocity of the particle. Note also that v_ph v_gr = 
E/m. 

This reasoning is similar to de Broglie’s arguments which were developed in detail in 
Sect. 9.3. In particular, the importance of distinguishing the de Broglie waves, which move 
with the phase velocity, from the particle, which travels at the group velocity of 
the superposition of these waves, is demonstrated in the expressions (9.15) and (9.16). 
Schrödinger explicitly recognises that this is precisely the insight provided by de Broglie, 
writing 


‘We find here again a theorem for the “phase waves” of the electron, which M. de Broglie 
had derived, with essential reference to the relativity theory, in those fine researches [(de 
Broglie, 1924b)] to which I owe the inspiration for this work.’ 
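
The distinction between the phase and group velocities can be checked numerically from the dispersion relation k = √[2m(ħω − U)]/ħ. The sketch below differentiates this relation by a centred finite difference and compares the result with the particle velocity; the function names and the test values of E and U are hypothetical choices made for illustration.

```python
# Sketch: phase and group velocities from the dispersion relation
# k(omega) = sqrt(2 m (hbar omega - U)) / hbar.
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s


def wavenumber(omega, mass, potential):
    """Wavenumber k for angular frequency omega (requires hbar*omega > U)."""
    return math.sqrt(2.0 * mass * (HBAR * omega - potential)) / HBAR


def velocities(energy, mass, potential, rel_step=1e-6):
    """Return (v_ph, v_gr), with v_gr = d(omega)/dk by a centred difference."""
    omega = energy / HBAR
    d_omega = rel_step * omega
    dk = (wavenumber(omega + d_omega, mass, potential)
          - wavenumber(omega - d_omega, mass, potential))
    v_gr = 2.0 * d_omega / dk
    v_ph = omega / wavenumber(omega, mass, potential)
    return v_ph, v_gr


if __name__ == "__main__":
    m_e, ev = 9.1093837e-31, 1.602176634e-19
    energy, potential = 10.0 * ev, 4.0 * ev        # illustrative values only
    v_ph, v_gr = velocities(energy, m_e, potential)
    v_particle = math.sqrt(2.0 * m_e * (energy - potential)) / m_e
    print(f"v_gr = {v_gr:.6e} m/s, particle velocity = {v_particle:.6e} m/s")
    print(f"v_ph * v_gr = {v_ph * v_gr:.6e}, E/m = {energy / m_e:.6e}")
```

Note that the phase velocity depends on the zero of energy adopted for E, whereas the group velocity does not.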


Schrödinger devotes considerable attention and caution to the formal underpinning of 
undulatory mechanics according to the precepts of Hamilton-Jacobi theory and then adopts 
the simplest means of formulating the wave equation. The standard wave equation for ψ 
can be written 

$$\nabla^2\psi - \frac{1}{v_{\rm ph}^2}\,\ddot{\psi} = 0. \tag{14.55}$$

Stationary solutions are sought in which the wavefunction ψ depends upon time as 
exp(iωt) = exp(i2πνt) and so 

$$\nabla^2\psi + \frac{4\pi^2\nu^2}{v_{\rm ph}^2}\,\psi = 0. \tag{14.56}$$

Now substituting (14.51) for v_ph and recalling that E = hν, we find 

$$\nabla^2\psi + \frac{8\pi^2 m}{h^2}(E - U)\,\psi = 0 \qquad\text{or}\qquad \nabla^2\psi + \frac{8\pi^2 m}{h^2}(h\nu - U)\,\psi = 0. \tag{14.57}$$

These equations are exactly the same as the wave equation (14.37) derived in his first paper. 
Schrödinger is duly cautious about the uniqueness of this equation, but argues on grounds of 
simplicity that it will be adopted as the wave equation for describing quantum phenomena. 
He also notes that, as a partial differential equation, very large numbers of solutions are 
possible. Although the quantum relation E = hν has been introduced, the quantisation 
of the energy levels of atomic systems is not bolted on arbitrarily, but is determined by 
the boundary conditions needed to satisfy the requirements that the function ψ must be 
single-valued, finite and continuous throughout configuration space. He had applied these 
concepts in Part 1 and now he gives further examples of the power of these methods. 


14.5.2 Applications 


The solution of Schrödinger’s wave equation for the hydrogen atom was a truly impressive 
feat and now he extended the concepts to the problems which had already been treated with 
success by Born, Heisenberg and Jordan (1926). By the time he was working on Part 2 
of his series of papers, he had discovered Courant and Hilbert’s Methods of Mathematical 
Physics (1924) and was bowled over by its contents. In an enthusiastic letter to Wien of 22 
February 1926, he writes: 

‘Time is flying. Each second or third day brings with it a small novelty; it works, not 
I, and that “It” is the magnificent classical mathematics and Hilbert’s mathematics, the 
wonderful edifice of eigenvalues. These unfold everything so clearly before us, that all 
we have to do is take it, without any labour and bothering; because the correct method 
is provided in time, as soon as one needs it, completely automatically. I am so happy 
to have escaped from terrible mechanics, including its action and angle variables and 
the perturbation theory, which I have never really understood. Now everything becomes 
linear, everything can be superposed; one computes as easily and comfortably as in good 
old acoustics. Even perturbation theory [in the new mechanics] is not more complicated 
than [considering] the forced vibrations of a string.’ 


This remarkable coincidence of the appearance of Courant and Hilbert’s book and the 
immediate application of eigenfunctions and eigenvalues to wave mechanics was one of 
the reasons for Schrödinger’s amazingly rapid progress in 1926. 


The Planck oscillator 


The first new application was to the harmonic oscillator, referred to as the Planck oscillator 
because of its fundamental role in Planck’s pioneering paper of 1900. Writing the kinetic 
and potential energy terms for the one-dimensional harmonic oscillator as T = ½mẋ² and 
U = ½mω₀²x², the wave equation becomes 

$$\frac{\mathrm{d}^2\psi}{\mathrm{d}x^2} + \frac{8\pi^2 m}{h^2}\left(E - \tfrac{1}{2} m\omega_0^2 x^2\right)\psi = 0. \tag{14.58}$$

Changing the notation to a = 8π²mE/h² and b = 4π²m²ω₀²/h², this equation becomes 

$$\frac{\mathrm{d}^2\psi}{\mathrm{d}x^2} + (a - bx^2)\,\psi = 0. \tag{14.59}$$

Changing variables to y = x b^{1/4}, the equation is reduced to a standard form 

$$\frac{\mathrm{d}^2\psi}{\mathrm{d}y^2} + \left(\frac{a}{\sqrt{b}} - y^2\right)\psi = 0. \tag{14.60}$$

Eigenfunction solutions for this equation had been presented in Courant and Hilbert’s 
Methods of Mathematical Physics (1924). The eigenvalues can only take the values a/√b = 
1, 3, 5, ..., (2n + 1), ... and the eigenfunctions are 

$$\psi_n(y) = e^{-y^2/2} H_n(y), \tag{14.61}$$

where H_n(y) are the orthogonal Hermite polynomials, the first few of which are 

$$H_0(y) = 1, \quad H_1(y) = 2y, \quad H_2(y) = 4y^2 - 2, \quad H_3(y) = 8y^3 - 12y, \quad H_4(y) = 16y^4 - 48y^2 + 12.$$

From the definitions of a and b, the eigenvalues are therefore 

$$\frac{a}{\sqrt{b}} = \frac{2E}{h\nu_0} = 1, 3, 5, \ldots, (2n + 1), \ldots \tag{14.62}$$

that is, 

$$E_n = \left(n + \tfrac{1}{2}\right) h\nu_0 = \left(n + \tfrac{1}{2}\right)\hbar\omega_0. \tag{14.63}$$


This striking relation is identical to Heisenberg’s results obtained from the matrix mechanics 
approach to quantum phenomena. Schrödinger was also able to derive the forms of the 
eigenfunctions, to which we will return in the next section. He was well aware of the 
fact that the solution included the zero point energy, ½hν₀, and that it is observed in 
measurements of the frequencies of band edges (see Sects. 11.5 and 12.5). 
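
The claim that ψ_n = exp(−y²/2) H_n(y) solves the standard form of the oscillator equation with eigenvalue a/√b = 2n + 1 can be verified with NumPy’s Hermite-polynomial tools, which use the same (physicists’) convention as the polynomials listed above. This is a check of our own rather than anything in Schrödinger’s paper. Differentiating the product shows that ψ″ + (2n + 1 − y²)ψ = exp(−y²/2)[H_n″ − 2yH_n′ + 2nH_n], so it suffices to evaluate the bracket.

```python
# Sketch: check that exp(-y^2/2) H_n(y) satisfies psi'' + (2n + 1 - y^2) psi = 0.
import numpy as np
from numpy.polynomial.hermite import Hermite, herm2poly


def hermite_power_series(n):
    """Coefficients of H_n(y) in ascending powers of y."""
    coefficients = np.zeros(n + 1)
    coefficients[n] = 1.0
    return herm2poly(coefficients)


def oscillator_residual(n, y):
    """Residual psi'' + (2n + 1 - y^2) psi evaluated at the points y."""
    hermite_n = Hermite.basis(n)
    bracket = (hermite_n.deriv(2)(y)
               - 2.0 * y * hermite_n.deriv(1)(y)
               + 2.0 * n * hermite_n(y))
    return np.exp(-y**2 / 2.0) * bracket


if __name__ == "__main__":
    print("H_4 coefficients (ascending powers):", hermite_power_series(4))
    y = np.linspace(-3.0, 3.0, 61)
    for n in range(5):
        print(n, np.max(np.abs(oscillator_residual(n, y))))
```

The residuals vanish to rounding error for every n, confirming that the energies follow the odd-integer sequence of (14.62).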


Rotator with a fixed axis 


In this case, the energy is entirely in the kinetic energy of rotation of the rotator and the only 
variable is the angle of rotation φ. For this rotational motion, the equivalence 
with the linear motion of the electron in a harmonic oscillator is 

$$\tfrac{1}{2} m\dot{x}^2 \to \tfrac{1}{2} I\dot{\phi}^2, \qquad m \to I, \qquad x \to \phi, \tag{14.64}$$

where I is the moment of inertia of the rotator. Therefore, the wave equation becomes 

$$\frac{\mathrm{d}^2\psi}{\mathrm{d}\phi^2} + \frac{8\pi^2 I}{h^2}\,E\,\psi = 0. \tag{14.65}$$

This is a simple harmonic equation with solution 

$$\psi = \begin{matrix} \sin \\ \cos \end{matrix} \left[ \left(\frac{8\pi^2 I E}{h^2}\right)^{1/2} \phi \right]. \tag{14.66}$$

The quantisation arises from the requirement that the wavefunction be single-valued and 
continuous and so (8π²IE/h²)^{1/2} = n = 1, 2, 3, ..., and the quantised energy levels of the 
rotator are 

$$E_n = \frac{n^2 h^2}{8\pi^2 I}, \tag{14.67}$$


in agreement with previous quantum arguments. 


Rigid rotator with free axes 


In this case, the motion of the rotator is constrained such that the ends of any diameter 
lie on the surface of a sphere. Therefore, the wave equation has to be written in spherical 
polar coordinates. The kinetic energy in the θ and φ directions is written in terms of the 
angular momentum about the i_θ and i_φ directions. Since T = ½m(v_θ² + v_φ²) and the angular 
momenta about these axes are L_θ = m r v_θ and L_φ = m r sin θ v_φ, 

$$T = \frac{1}{2I}\left( L_\theta^2 + \frac{L_\phi^2}{\sin^2\theta} \right), \tag{14.68}$$

where I = mr² is the moment of inertia. Therefore, the wave equation becomes 

$$\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2\psi}{\partial\phi^2} + \frac{8\pi^2 I E}{h^2}\,\psi = 0, \tag{14.69}$$

where there is no radial dependence since the circumference of the rotator is constrained 
to lie on the surface of a fixed sphere; the Laplacian operator can only depend upon the 
polar angles θ and φ. The solution is sought by the usual procedure of separating variables, 
ψ = Θ(θ) Φ(φ), and then the quantisation condition enters from the requirement that the 
functions Θ(θ) and Φ(φ) are continuous and single valued on the surface of the sphere. 
These conditions lead to the requirement that only discrete values of the angular quantum 
numbers are possible so that the energy eigenvalues are 

$$E = \frac{l(l+1)h^2}{8\pi^2 I}, \tag{14.70}$$

where l = 0, 1, 2, 3, ... Notice that we are using modern standard notation for the angular 
momentum quantum number l, rather than Schrödinger’s n. This expression will be 
recognised as the standard expression for the quantised energy levels of, for example, a diatomic 
molecule. 
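
The difference between the two rotator spectra is easily seen by listing the levels in units of h²/8π²I. The short sketch below is ours; the restriction to differences between adjacent levels is an illustrative assumption, not something derived in this section.

```python
# Sketch: rotator energy levels in units of h^2 / (8 pi^2 I).

def fixed_axis_levels(n_max):
    """E_n = n^2 for n = 1, ..., n_max (rotator with a fixed axis)."""
    return [n**2 for n in range(1, n_max + 1)]


def free_axis_levels(l_max):
    """E_l = l (l + 1) for l = 0, ..., l_max (rigid rotator with free axes)."""
    return [l * (l + 1) for l in range(l_max + 1)]


def adjacent_differences(levels):
    """Energy differences between neighbouring levels."""
    return [upper - lower for lower, upper in zip(levels, levels[1:])]


if __name__ == "__main__":
    print("fixed axis:", fixed_axis_levels(5), adjacent_differences(fixed_axis_levels(5)))
    print("free axes: ", free_axis_levels(4), adjacent_differences(free_axis_levels(4)))
```

For the rotator with free axes the differences between adjacent levels are 2, 4, 6, ..., that is, equally spaced, which is the pattern familiar from the rotational spectra of diatomic molecules.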


The non-rigid rotator — the diatomic molecule 


In the final part of his paper, Schrödinger tackles the problem of the rotating and vibrating 
molecule. The problem now involves six degrees of freedom as well as harmonic coupling 
between the two atoms of the diatomic molecule. We simply quote Schrödinger’s final result 
for the quantised energy levels: 





$$E = E_t + \frac{l(l+1)h^2}{8\pi^2 I}\left(1 - \frac{\epsilon}{1 + 3\epsilon}\right) + \left(n + \tfrac{1}{2}\right) h\nu_0 \sqrt{1 + 3\epsilon}, \tag{14.71}$$

where n = 0, 1, 2, ... and l = 0, 1, 2, ... The small quantity 

$$\epsilon = \frac{l(l+1)h^2}{16\pi^4\nu_0^2 I^2} \tag{14.72}$$

is a measure of the ratio of the rotational to vibrational energy of the molecule. E_t is the molecule’s 
translational energy. The second and third terms correspond to the quantised rotational 
and vibrational energies and are the familiar terms with small corrections represented by 
the small quantity ε for the coupling between the modes. Schrödinger recognised that this 
calculation does not take account of the important deviations from a harmonic potential 
for a more realistic model of interatomic forces. For this, perturbation solutions of the 
wave equation are needed and the appropriate procedures were to be developed in Part 3 of 
Schrödinger’s series of papers. 


14.6 Wave-packets 


Schrödinger’s third paper, published on 9 July 1926, was a short note to Die 
Naturwissenschaften concerning the representation of an oscillator in undulatory mechanics by a 
wave-packet, defined by the superposition of the eigenfunctions of the harmonic oscillator 
(Schrödinger, 1926a). He summarises the results of the calculations for the wavefunctions 
of the harmonic oscillator as follows: 

$$\psi_n = e^{-y^2/2} H_n(y)\, e^{i\omega_n t}, \tag{14.73}$$

where ω_n = (n + ½)ω₀ and we have used the notation of Sect. 14.5.2. H_n(y) are the Hermite 
polynomials and y = x√(2πmω₀/h). The functions (14.73) are normalised by multiplying 
by (2ⁿn!)^{−1/2} and these are referred to as Hermite’s orthogonal functions. The first five 
normalised wavefunctions are displayed in the range −3 < y < +3 in Fig. 14.2a, which is 
taken from Schrödinger’s paper. Outside this range the wavefunctions decrease 
exponentially to zero. 

Fig. 14.2 (a) The first five eigenfunctions of a harmonic oscillator ψ_n = e^{−x²/2} H_n(x). The functions decay exponentially outside 
the range −3 < x < 3 shown in the diagram. (b) The wave-packet associated with the Planck, or harmonic, 
oscillator as represented by the superposition of the wavefunctions ψ_n(x) (Schrödinger, 1926a). 

To create a wave-packet, Schrödinger adopts the complete set of eigenfunctions and 
assumes that they are of large amplitude A >> 1. Then, he chooses the wave-packet to be 
described by the following function 


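$$\psi = \sum_{n=0}^{\infty} \left(\frac{A}{2}\right)^{n} \frac{\psi_n}{n!} = e^{i\omega_0 t/2} \sum_{n=0}^{\infty} \left(\frac{A}{2}\, e^{i\omega_0 t}\right)^{n} \frac{1}{n!}\, e^{-y^2/2}\, H_n(y) . \tag{14.74}$$


The significance of (14.74) is that Hermite’s orthogonal functions, $(2^n n!)^{-1/2}\,\psi_n$, are weighted
by the factor $A^n/\sqrt{2^n n!}$, the square of which is $(A^2/2)^n/n!$. The latter function can be compared
with the function $z^n/n!$ which, for large values of n, has a sharp maximum at n = z. Thus, by weighting
the eigenfunctions with this factor, a narrow range of values of n is selected about the value
$n = A^2/2$. This choice of weighting also has the advantage that the series (14.74) can be summed
exactly since


$$\sum_{n=0}^{\infty} \frac{s^n}{n!}\, e^{-y^2/2}\, H_n(y) = \exp\left(-s^2 + 2sy - \frac{y^2}{2}\right) . \tag{14.75}$$


Hence, setting $s = (A/2)\, e^{i\omega_0 t}$,


$$\psi(y, t) = \exp\left(\frac{i\omega_0 t}{2} - \frac{A^2}{4}\, e^{2i\omega_0 t} + A y\, e^{i\omega_0 t} - \frac{y^2}{2}\right) . \tag{14.76}$$


Taking the real part of (14.76), the evolution of the wave-packet is

$$\psi = \exp\left[\frac{A^2}{4} - \frac{1}{2}\left(y - A\cos\omega_0 t\right)^2\right] \cos\left[\frac{\omega_0 t}{2} + (A\sin\omega_0 t)\left(y - \frac{A}{2}\cos\omega_0 t\right)\right] . \tag{14.77}$$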
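
As a numerical cross-check on the summation leading to (14.77), the following minimal sketch sums the series (14.74) term by term and compares its real part with the closed form. It is an illustration only: the function names, the choice $\omega_0 = 1$, the modest amplitude A = 4 and the truncation at 60 terms are assumptions made here and are not taken from Schrödinger's paper.

```python
import cmath
import math


def hermite(n, y):
    """Physicists' Hermite polynomial H_n(y) from H_{k+1} = 2y H_k - 2k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * y
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * y * h - 2.0 * k * h_prev
    return h


def packet_series(y, t, A, omega0=1.0, n_max=60):
    """Partial sum of (14.74): sum over n of (A/2)^n psi_n / n!."""
    total = 0j
    for n in range(n_max + 1):
        psi_n = (math.exp(-y * y / 2.0) * hermite(n, y)
                 * cmath.exp(1j * (n + 0.5) * omega0 * t))
        total += (A / 2.0) ** n / math.factorial(n) * psi_n
    return total


def packet_closed(y, t, A, omega0=1.0):
    """Real wave-packet in the closed form (14.77)."""
    c, s = math.cos(omega0 * t), math.sin(omega0 * t)
    envelope = math.exp(A * A / 4.0 - 0.5 * (y - A * c) ** 2)
    phase = omega0 * t / 2.0 + A * s * (y - 0.5 * A * c)
    return envelope * math.cos(phase)


if __name__ == "__main__":
    A = 4.0  # assumed amplitude, small enough that 60 terms suffice
    for y, t in [(-3.0, 0.0), (0.5, 0.7), (2.0, 2.0)]:
        print(y, t, packet_series(y, t, A).real, packet_closed(y, t, A))
```

The two columns printed for each (y, t) should agree to rounding error, which is what is meant by saying that the series can be summed exactly.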


This is the remarkable final result of Schrödinger’s calculation, one which was to usher in 
a new approach to understanding the nature of physics at the quantum level. The solution 
is illustrated in Fig. 14.2b. Schrödinger explains carefully the significance of each term in 
(14.77). The first term in large square brackets is a Gaussian error-curve which is centred 
at the position $y = A\cos\omega_0 t$. The width of the distribution is of order unity and so is small 
compared with the amplitude of the oscillation of the error-curve along the y-axis. If we 
now revert to the x-coordinate, the amplitude a of the oscillation is 


$$a = A\sqrt{h/2\pi m\omega_0} , \tag{14.78}$$


and so the classical energy of the oscillator, if the particle is assumed to have mass m, 
is 


$$E = \tfrac{1}{2}\, a^2\omega_0^2\, m = \frac{A^2}{2}\,\hbar\omega_0 = n\hbar\omega_0 , \tag{14.79}$$


exactly the mean energy of the oscillator which has the average quantum number n of 
the group. Thus, the wave-packet represents the oscillation of a particle of mass m in the 
quantum state n. 
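
The statement that the weighting selects a narrow range of quantum numbers can be made concrete. Combining (14.78) and (14.79) gives a mean quantum number $n = A^2/2$, and the squares of the weights applied to the normalised eigenfunctions are proportional to $(A^2/2)^n/n!$. The sketch below, in which the function name and the cut-off at n = 600 are choices made here for illustration, evaluates the mean, the peak and the spread of this distribution for A = 20, the value used in Fig. 14.3.

```python
import math


def squared_weights(A, n_max):
    """Normalised values of (A^2/2)^n / n! for n = 0..n_max (logs avoid overflow)."""
    z = A * A / 2.0
    raw = [math.exp(n * math.log(z) - math.lgamma(n + 1)) for n in range(n_max + 1)]
    total = sum(raw)
    return [w / total for w in raw]


A = 20.0
weights = squared_weights(A, 600)
mean_n = sum(n * w for n, w in enumerate(weights))
peak_n = max(range(len(weights)), key=weights.__getitem__)
spread = math.sqrt(sum((n - mean_n) ** 2 * w for n, w in enumerate(weights)))

if __name__ == "__main__":
    print(mean_n, peak_n, spread)
```

For A = 20 the mean is 200 with a spread of about 14, so the fractional width of the group of states is only a few per cent, which is why the packet behaves so nearly classically.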

The second term in large square brackets in (14.77) represents the modulation of the 
Gaussian error-curve by the ‘carrier’ signal, shown in Fig. 14.2b. The wave-packet oscillates 
back and forward about y = 0, exactly as in the case of a harmonic oscillator. Schrödinger 
notes the important point that, unlike normal wave-packets in which the waveform is 
eventually broadened because of dispersion, there is no dispersion of the wave-packet 
according to (14.77). We recall that the sum is an exact solution of the problem of the 
oscillator and so there are no higher order corrections in the case of the Planck oscillator. 
Another elementary example of the non-dispersive propagation of a wave-packet in wave 
mechanics occurs in the case of the representation of a particle moving at constant speed 
v, as illustrated in the endnote.¹⁰ 

The solution (14.77) for half an oscillation cycle is shown in Fig. 14.3 for the case 
A = 20. This diagram illustrates how, for large quantum numbers n, the behaviour of 
the wave-packet exactly mimics that of an oscillator of mass m. In this representation, 
time t = 0 corresponds to the maximum value of x which occurs at $\cos\omega_0 t = 1$, so that 
$\sin\omega_0 t = 0$. In this case, the second term in the second square bracket is zero and there 
are no ‘beats’ in the profile of the wave-packet. On the other hand, when $\cos\omega_0 t = 0$ and 
$\sin\omega_0 t = 1$, the profile is modulated by the function $\cos Ax = \cos 20x$. The overall mean 
locus of the wave-packet in Fig. 14.3 is half a cosine wave centred on t = 0. In complete 
analogy with classical mechanics, Schrödinger remarks that the 


‘variability of the ‘corrugations’ is to be conceived as depending on velocity, and, as such, 
is completely intelligible from all general aspects of undulatory mechanics — but I do not 
wish to discuss this further at present.’ 

Fig. 14.3 The evolution of the form of a wave-packet with A = 20 over half a period of oscillation of a quantised
harmonic oscillator according to (14.77) (Schrödinger, 1926a). The diagram was kindly created by Dr. David Green. 


14.7 Quantisation as an eigenvalue problem (Part 3) 


Having discovered Courant and Hilbert’s Methods of Mathematical Physics, Schrödinger 
redoubled his efforts. The next task was to develop perturbation theory, a major goal being 
the application of these techniques to account for the Stark broadening of the Balmer lines 
of hydrogen. This had been one of the great triumphs of the old quantum theory, thanks to 
the work of Epstein (1916a,b) and Schwarzschild (1916). 


14.7.1 Courant and Hilbert (1924) 


Courant and Hilbert’s remarkable textbook on the Methods of Mathematical Physics 
provided a complete summation of all the tools Schrödinger needed to find solutions of his 
wave equation. The titles of the chapters themselves provide an indication of how 
appropriate the mathematical methods were for the solution of wave equations: ‘III Linear 
Integral Equations, IV The Calculus of Variations, V Vibration and Eigenvalue Problems, 
VI Application of the Calculus of Variations to Eigenvalue Problems, VII Special 
Functions Defined by Eigenvalue Problems’. For many purposes, Schrödinger simply had to 
translate the language of Courant and Hilbert into what he now called wave mechanics. As 
he remarked in the introduction of this paper, Part 3 of the series, 


‘The method is essentially the same as that used by Lord Rayleigh in investigating the 
vibrations of a string with small inhomogeneities in his Theory of Sound [(Strutt (Lord 
Rayleigh), 1894)]. This was a particularly simple case, as the differential equation of 
the unperturbed problem had constant coefficients, and only the perturbing terms were 
arbitrary functions along the string. A complete generalisation is possible not merely with 
regard to these points, but also for the specially important case of several independent 
variables, i.e. for partial differential equations, in which multiple [eigenvalues] appear in 
the unperturbed problem, and where the addition of a perturbing term causes the splitting 
up of such values and is of the greatest interest in well-known spectroscopic questions 
(Zeeman effect, Stark effect, Multiplicities).’ 


The key mathematical tools were the use of Hermitian operators and the techniques 
developed by Sturm and Liouville to treat perturbation solutions of particular types of 
second-order differential equations. The concepts of eigenfunctions and eigenvalues were 
already well known, as indicated by the reference to Rayleigh’s analyses in the context 
of sound waves, but now they had to be applied to Schrödinger’s wave equation. The key 
mathematical features of the eigenfunctions involved in quantum mechanics are described 
in textbooks on quantum mechanics and are as follows: 


1. If L is a differential operator, the eigenvalue equation is 
Lu(x)=aAu(x). (14.80) 


The region Ω over which the problem is to be solved has to be prescribed and appropriate 
boundary conditions adopted. The key feature of the solutions needed for quantum 
mechanics is that L should be Hermitian, meaning that 


$$\int_\Omega u^*(x)\, L v(x)\, dx = \left[\int_\Omega v^*(x)\, L u(x)\, dx\right]^* , \tag{14.81}$$


where the asterisk means the complex conjugate and u and v are arbitrary functions 
which satisfy the boundary conditions. The Hermitian behaviour of the wavefunctions 
is exactly equivalent to the properties of Hermitian matrices, discussed in Sect. 12.3. In 
the case of matrices, the process of determining the eigenvalues is part of the operation 
of converting the matrices into diagonal form. 

2. If L is a Hermitian operator, the eigenvalues are real, just as in the case of Hermitian 
matrices. 

3. The eigenfunctions of a Hermitian differential operator, associated with different 
eigenvalues, are orthogonal, meaning that 


$$u \cdot v = \int_\Omega u^*(x)\, v(x)\, dx = 0 , \quad \text{if } u \neq v . \tag{14.82}$$


4. Under very general conditions, the set of eigenfunction solutions forms a complete 
orthogonal, normalised set, or orthonormal set, so that, just as in the case of Fourier 
series, any well behaved function of x can be synthesised by an infinite series of these 
eigenfunctions. Thus, 


$$f(x) = \sum_n c_n u_n(x) . \tag{14.83}$$


Since the $u_n$ are orthonormal, the coefficients $c_n$ can be readily found, 


$$u_m \cdot f = \sum_n c_n\, u_m \cdot u_n = \sum_n c_n\, \delta_{mn} = c_m . \tag{14.84}$$
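
The properties listed above can be illustrated with the simplest Hermitian operator, $d^2/dx^2$ on the interval 0 ≤ x ≤ 1 with the eigenfunctions required to vanish at the ends, for which the orthonormal eigenfunctions are $u_n(x) = \sqrt{2}\sin n\pi x$. The sketch below is a hedged illustration rather than anything taken from Courant and Hilbert: the test function f(x) = x(1 − x), the trapezoidal quadrature and the truncation at 39 terms are all assumptions made here. It checks orthonormality numerically, finds the coefficients by the projection (14.84) and resynthesises f(x) from the series (14.83).

```python
import math


def u(n, x):
    """Orthonormal eigenfunctions of d^2/dx^2 on [0, 1] vanishing at both ends."""
    return math.sqrt(2.0) * math.sin(n * math.pi * x)


def inner(f, g, steps=4000):
    """Trapezoidal estimate of the inner product of two real functions on [0, 1]."""
    h = 1.0 / steps
    total = 0.5 * (f(0.0) * g(0.0) + f(1.0) * g(1.0))
    total += sum(f(i * h) * g(i * h) for i in range(1, steps))
    return total * h


def f(x):
    """Illustrative function to be expanded (an assumption of this sketch)."""
    return x * (1.0 - x)


# Coefficients c_n = u_n . f, as in (14.84)
coeffs = {n: inner(lambda x, n=n: u(n, x), f) for n in range(1, 40)}


def f_series(x):
    """Truncated version of the expansion (14.83)."""
    return sum(c * u(n, x) for n, c in coeffs.items())


if __name__ == "__main__":
    print(inner(lambda x: u(2, x), lambda x: u(3, x)))  # orthogonality
    print(inner(lambda x: u(3, x), lambda x: u(3, x)))  # normalisation
    print(f(0.3), f_series(0.3))
```

The coefficients fall off as $1/n^3$ for this choice of f(x), so a few tens of terms already reproduce the function to better than one part in a thousand.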


A differential operator of special importance is that associated with the Sturm–Liouville 
equation. The operator is 


$$L(y) = p\,\frac{d^2 y}{dx^2} + \frac{dp}{dx}\frac{dy}{dx} - qy = \frac{d}{dx}\left(p\,\frac{dy}{dx}\right) - qy , \tag{14.85}$$


where y = y(x) is the dependent function, p = p(x), dp/dx and q = q(x) are continuous 
functions of x and p > 0. If L(y) = 0, this becomes a linear, homogeneous, second-order 
differential equation and it can be readily shown to be Hermitian. Generalising for some 
arbitrary weighting function ρ(x) which is a continuous function of x and which never 
becomes negative or zero, the set of eigenfunctions found by setting $L(y) + E\rho(x)y(x) = 0$ 
and satisfying the boundary conditions is complete and orthonormal. The solutions of the 
homogeneous equation result in the eigenfunctions $u_i$ and eigenvalues $E_i$ for the stationary 
states in, for example, the hydrogen atom. Note that, with the inclusion of the weighting 
function, the normalisation condition for the eigenfunctions is 

$$\int \rho(x)\, u_i(x)\, u_j(x)\, dx = \delta_{ij} = \begin{cases} 1 & \text{if } i = j , \\ 0 & \text{if } i \neq j . \end{cases} \tag{14.86}$$

The next step is to find the solutions when the system is subject to a small perturbation. We 
begin by assuming that the unperturbed solutions for the eigenfunctions $u_k$ and eigenvalues 
$E_k$ are known and then ask how these change when a small perturbing term $-\lambda r(x)y$ is 
added to the wave equation so that it becomes 


$$L[y] - \lambda r(x)\, y + E\rho(x)\, y(x) = 0 . \tag{14.87}$$


λ is assumed to be a small quantity and r(x) is an arbitrary continuous function of x. It is 
therefore expected that the solutions will only result in small changes of the coefficients q 
in (14.85). The continuity aspect of the solutions was a key feature for Schrödinger. In the 
introduction to the paper, he writes 

‘[The perturbation method] is based upon the important property of continuity possessed 
by [eigenvalues] and [eigenfunctions], principally, for our purpose, upon their continuous 
dependence on the coefficients of the differential equation, and less upon the extent of the 
domain since in our case the domain... and... the boundary conditions... are generally 
the same for the unperturbed and perturbed problems.’ 


Again, Courant and Hilbert provide the complete solution. For small values of λ, the 
energies and eigenfunctions of the perturbed eigenstates are slightly different from their 
unperturbed values and so Schrödinger writes, 


$$E_k^* = E_k + \lambda\epsilon_k ; \qquad u_k^* = u_k(x) + \lambda v_k(x) . \tag{14.88}$$


Substituting these relations into (14.87) and recalling that the unperturbed eigenfunction 
$u_k$ satisfies the unperturbed wave equation, we find, to first order in λ, 


$$L[v_k] + E_k\rho\, v_k = (r - \epsilon_k\rho)\, u_k . \tag{14.89}$$


This is now an inhomogeneous equation for $v_k$. Courant and Hilbert showed that the 
eigenfunction equation for $v_k$ only has solutions if the right-hand side of (14.89) is orthogonal 
to the associated solution of the homogeneous equation. Therefore, 


$$\int (r - \epsilon_k\rho)\, u_k^2\, dx = 0 , \qquad \epsilon_k = \frac{\int r u_k^2\, dx}{\int \rho u_k^2\, dx} . \tag{14.90}$$

If the functions $u_k$ have been normalised, then 

$$\epsilon_k = \int r u_k^2\, dx . \tag{14.91}$$


Thus, the perturbed energy of the eigenstate k has been found, without determining the 
perturbed eigenfunction $v_k$. This result is the exact equivalent of that found in classical 
mechanics, namely, that the energy perturbation is in the first approximation equal to the 
perturbing function averaged over the unperturbed motion. 
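
The first-order result can be tested on a case simple enough to solve numerically. The sketch below makes a number of assumptions purely for illustration: p = 1, q = 0 and ρ = 1 in (14.85), so that the perturbed equation (14.87) reads $y'' - \lambda r(x)y + Ey = 0$ on 0 ≤ x ≤ 1 with y vanishing at both ends; the perturbing function is taken to be r(x) = x; and λ = 0.2. The unperturbed eigenfunctions are then $u_k = \sqrt{2}\sin k\pi x$ with $E_k = k^2\pi^2$. The perturbed eigenvalue is found by a shooting method (fourth-order Runge–Kutta integration with bisection on E), a numerical device chosen here and not one used by Schrödinger, and is compared with $E_k + \lambda\epsilon_k$ from (14.91).

```python
import math


def endpoint_value(E, lam, r, steps=1000):
    """Integrate y'' = (lam*r(x) - E) y from y(0)=0, y'(0)=1 to x=1 (RK4); return y(1)."""
    h = 1.0 / steps
    x, y, dy = 0.0, 0.0, 1.0

    def acc(x_, y_):
        return (lam * r(x_) - E) * y_

    for _ in range(steps):
        k1y, k1v = dy, acc(x, y)
        k2y, k2v = dy + 0.5 * h * k1v, acc(x + 0.5 * h, y + 0.5 * h * k1y)
        k3y, k3v = dy + 0.5 * h * k2v, acc(x + 0.5 * h, y + 0.5 * h * k2y)
        k4y, k4v = dy + h * k3v, acc(x + h, y + h * k3y)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
        dy += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        x += h
    return y


def eigenvalue(lam, r, lo, hi, tol=1e-9):
    """Bisect on E until y(1; E) = 0, so that the boundary condition is satisfied."""
    f_lo = endpoint_value(lo, lam, r)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = endpoint_value(mid, lam, r)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def first_order_shift(r, k=1, steps=2000):
    """epsilon_k = integral of r u_k^2 dx with u_k = sqrt(2) sin(k pi x), as in (14.91)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(1, steps):  # the integrand vanishes at both end points
        x = i * h
        total += r(x) * 2.0 * math.sin(k * math.pi * x) ** 2
    return total * h


def r(x):
    """Hypothetical perturbing function used for this illustration."""
    return x


lam = 0.2
E_exact = eigenvalue(lam, r, 8.0, 12.0)          # bracket encloses only the lowest state
E_first_order = math.pi ** 2 + lam * first_order_shift(r)

if __name__ == "__main__":
    print(E_exact, E_first_order)
```

The two values differ only in the fifth significant figure, the residual being the second-order term in λ which, as expected for the lowest state, lowers the energy slightly.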

Finally, to find the function $v_k$, the inhomogeneous equation is solved in terms of the 
complete set of eigenfunctions $u_i(x)$, 


$$v_k(x) = \sum_{i=1}^{\infty} \gamma_{ki}\, u_i(x) , \tag{14.92}$$

with the result that 

$$\gamma_{ki} = \frac{c_{ki}}{E_k - E_i} = \frac{\int r u_k u_i\, dx}{E_k - E_i} \quad \text{if } i \neq k . \tag{14.93}$$


In this expression, 


$$c_{ki} = \int (r - \epsilon_k\rho)\, u_k u_i\, dx = \begin{cases} \int r u_k u_i\, dx & \text{for } i \neq k , \\ 0 & \text{for } i = k . \end{cases} \tag{14.94}$$


Hence, the perturbed eigenfunction and energy eigenvalue of the perturbed state are 


$$u_k^*(x) = u_k(x) + \lambda {\sum_{i=1}^{\infty}}' \frac{u_i(x) \int r u_k u_i\, dx}{E_k - E_i} ; \qquad E_k^* = E_k + \lambda \int r u_k^2\, dx , \tag{14.95}$$

where the dash on the summation means that the term i = k should be omitted from the 
sum. 

Following his careful exposition of the fundamentals of perturbation theory, Schrödinger 
extends the procedures to several independent variables, resulting in a set of partial, rather 
than ordinary, differential equations. 


14.7.2 The Stark effect 


Schrödinger immediately applies the new formalism to the Stark effect, in which the 
perturbation is associated with the influence of a uniform electric field F upon the electron 
in the atom. Hence, the perturbed Schrödinger wave equation becomes 


$$\nabla^2\psi + \frac{8\pi^2 m_e}{h^2}\left(E + \frac{e^2}{4\pi\epsilon_0 r} - eFz\right)\psi = 0 . \tag{14.96}$$


Here, we have preserved Schrödinger’s notation in which the electric field strength is 
written as F to avoid confusion with the energy E of the stationary states. He now solves 
the perturbation problem in two different ways. 

He first uses the method of Epstein and Schwarzschild in which they transformed to 
the parabolic coordinates used in celestial mechanics and solved the wave equation in that 
coordinate system (see Sect. 7.2 and the relations (7.2)). The coordinate transformations 
adopted by Schrödinger are: 


$$x = \sqrt{\lambda_1\lambda_2}\,\cos\phi ; \qquad y = \sqrt{\lambda_1\lambda_2}\,\sin\phi ; \qquad z = \tfrac{1}{2}(\lambda_1 - \lambda_2) . \tag{14.97}$$


With this transformation to $\lambda_1, \lambda_2, \phi$ coordinates, Schrödinger’s wave equation can be 
written in self-adjoint, or Hermitian, form and the solution found by separation of variables, 
$\psi = \Lambda_1(\lambda_1)\,\Lambda_2(\lambda_2)\,\Phi(\phi)$. First of all, the unperturbed solution is found using the associated 
Laguerre polynomials, which Schrödinger had found in Courant and Hilbert’s book, to 
describe the wavefunctions of the stationary states. Then, the solution for the perturbed 
wavefunctions under the influence of a uniform electric field is obtained. The perturbed 
stationary states have energies 

$$E = -\frac{m_e e^4}{8\epsilon_0^2 h^2 n^2} - \frac{3 h^2 \epsilon_0 F}{2\pi m_e e}\, n\,(k_2 - k_1) , \tag{14.98}$$


where $k_1$ and $k_2$ are parabolic quantum numbers, corresponding exactly to the quantum 
numbers $n_1$ and $n_2$ introduced in the classical Epstein–Schwarzschild result (7.7) in the 
old quantum theory. The principal quantum number is $n = n_1 + n_2 + n_3$ as in Sect. 7.2. In 
addition, Schrödinger noted that, just as in Heisenberg’s theory, there are stationary states 
with zero orbital quantum number and so the pendulum orbits, which had to be excluded 
on heuristic grounds in the old quantum theory, do not exist. 
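
To give a feeling for the magnitudes involved in (14.98), the sketch below evaluates the unperturbed energy and the first-order Stark shift in SI units. The numerical values of the constants are the CODATA values, and the field strength of $10^7$ V m$^{-1}$ and the choice n = 2, $k_2 - k_1 = 1$ are illustrative assumptions. The function name is hypothetical, and the overall sign attached to $k_2 - k_1$ follows the form of (14.98) adopted above; only the magnitude of the shift is used in the comparison. The point of the check is that the coefficient of the shift is simply $\tfrac{3}{2}\,n\,(k_2 - k_1)\,eFa_0$, where $a_0 = \epsilon_0 h^2/\pi m_e e^2$ is the Bohr radius.

```python
import math

# CODATA values in SI units
H_PLANCK = 6.62607015e-34      # J s
E_CHARGE = 1.602176634e-19     # C
M_E = 9.1093837015e-31         # kg
EPS0 = 8.8541878128e-12        # F m^-1


def stark_levels(n, k1, k2, field):
    """Return (unperturbed energy, first-order Stark shift) in joules, as in (14.98)."""
    unperturbed = -M_E * E_CHARGE ** 4 / (8.0 * EPS0 ** 2 * H_PLANCK ** 2 * n ** 2)
    shift = (-3.0 * H_PLANCK ** 2 * EPS0 * field * n * (k2 - k1)
             / (2.0 * math.pi * M_E * E_CHARGE))
    return unperturbed, shift


bohr_radius = EPS0 * H_PLANCK ** 2 / (math.pi * M_E * E_CHARGE ** 2)
field = 1.0e7                                   # assumed field strength in V m^-1
ground_state, _ = stark_levels(1, 0, 0, field)
_, shift_n2 = stark_levels(2, 0, 1, field)

if __name__ == "__main__":
    print(ground_state / E_CHARGE, "eV")        # about -13.6 eV
    print(abs(shift_n2) / E_CHARGE, "eV")       # about 1.6e-3 eV
```

Even in a field of $10^7$ V m$^{-1}$ the shift of the n = 2 level is only about one part in two thousand of its binding energy, which is why a first-order perturbation treatment is adequate.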

But Schrödinger did not leave matters there; he wanted to evaluate the intensities of 
the lines observed in the Stark effect as well. Here, he made use of his discovery of the 
equivalence between wave mechanics and matrix mechanics so that the matrix elements, 
which he had shown could be derived from wave mechanics, were identified with the dipole 
moments associated with transitions between stationary states (see Chap. 15). 

Finally, he returned to his calculation of the stationary states in the presence of an electric 
field and recast the whole calculation in terms of $(r, \theta, \phi)$ coordinates, what he referred to 
as the method of Bohr. Following a lengthy calculation, he demonstrated that exactly the 
same result as (14.98) was obtained, remarking that 


‘On the whole, we must admit that in the present case the method of secular perturbations 
[second approach] is considerably more troublesome than the direct application of a 
system of separation [first approach].’ 


14.8 Quantisation as an eigenvalue problem (Part 4) 


The fourth paper of the series, received by the Annalen der Physik on 21 June 1926, 
concerned the development of the time-dependent form of Schrödinger’s wave equation. 
In the first three papers of the series, Schrödinger had successfully developed the wave 
mechanical formalism for time-independent wave phenomena, but now he needed to extend 
the procedures to time-dependent phenomena such as, for example, particle scattering and 
transitions between stationary states in which the eigenvalues change from their initial to 
their final states. He achieved this by returning to the original form of the wave equation 
and understanding how it could be modified when, for example, the potential term is 
time-varying. 
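In time-independent form, the wave equation 

$$\nabla^2\psi - \frac{2m(E - V)}{E^2}\frac{\partial^2\psi}{\partial t^2} = 0 \quad \text{becomes} \quad \nabla^2\psi + \frac{8\pi^2 m}{h^2}(E - V)\psi = 0 , \tag{14.99}$$

where the time dependence of the wavefunction is assumed to be of the form 

$$\psi \propto \text{real part of } \left(e^{\pm 2\pi i E t/h}\right) , \tag{14.100}$$


and $E = h\nu = \hbar\omega$. It follows from this relation that 

$$\frac{\partial\psi}{\partial t} = \pm\frac{2\pi i E}{h}\left(e^{\pm 2\pi i E t/h}\right) = \pm\frac{2\pi i E}{h}\,\psi ; \qquad \frac{\partial^2\psi}{\partial t^2} = -\frac{4\pi^2 E^2}{h^2}\left(e^{\pm 2\pi i E t/h}\right) = -\frac{4\pi^2 E^2}{h^2}\,\psi . \tag{14.101}$$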


Schrödinger’s aim was to eliminate the energy E of the eigenstate from the wave equation 
so that it could become time-variable. He noted that the first differential of ψ in (14.101) 
determines the quantity Eψ in terms of ∂ψ/∂t and so he makes this substitution into 
(14.99): 


$$\nabla^2\psi - \frac{8\pi^2 m}{h^2}\, V\psi = \pm\frac{4\pi i m}{h}\frac{\partial\psi}{\partial t} . \tag{14.102}$$


Schrödinger recognised that, by making this substitution, he required the wavefunction to 
be complex, but he had a prescription for overcoming this problem. 


‘We will require the complex wavefunction ψ to satisfy one of these two equations [14.102]. 
Since the conjugate complex function $\bar\psi$ will then satisfy the other equation, we may take 
the real part of ψ as the real wave function (if we require it).’ 
The necessity of introducing complex numbers into quantum mechanics to describe the 
time dependence of the wavefunction can be understood from elementary arguments starting 
from de Broglie’s relation, as described in the endnotes.¹¹ 
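
The pairing of signs between (14.100) and (14.102) can be checked by finite differences. The sketch below is an illustration under stated assumptions: units in which h = m = 1, V = 0, and a particle confined to 0 ≤ x ≤ 1, for which (14.99) gives the lowest stationary state $u = \sin\pi x$ with $E = h^2/8m$. The function names and step size are choices made here. The wavefunction $\psi = u\, e^{\pm 2\pi i E t/h}$ is substituted into $\nabla^2\psi - (8\pi^2 m/h^2)V\psi \mp (4\pi i m/h)\,\partial\psi/\partial t$, which should vanish only when the sign in the equation matches the sign in the exponent.

```python
import cmath
import math

H_PLANCK = 1.0                              # illustrative units, an assumption
MASS = 1.0
ENERGY = H_PLANCK ** 2 / (8.0 * MASS)       # lowest state of a unit-length box, V = 0


def psi(x, t, sign):
    """Stationary state u(x) exp(sign * 2 pi i E t / h) with u = sin(pi x)."""
    return math.sin(math.pi * x) * cmath.exp(sign * 2j * math.pi * ENERGY * t / H_PLANCK)


def residual(x, t, sign_in_psi, sign_in_equation, d=1e-4):
    """Finite-difference value of  d2psi/dx2 - sign * (4 pi i m / h) dpsi/dt  for V = 0."""
    lap = (psi(x + d, t, sign_in_psi) - 2.0 * psi(x, t, sign_in_psi)
           + psi(x - d, t, sign_in_psi)) / d ** 2
    dpsi_dt = (psi(x, t + d, sign_in_psi) - psi(x, t - d, sign_in_psi)) / (2.0 * d)
    return lap - sign_in_equation * 4j * math.pi * MASS / H_PLANCK * dpsi_dt


if __name__ == "__main__":
    print(abs(residual(0.3, 0.7, +1, +1)))   # matched signs: close to zero
    print(abs(residual(0.3, 0.7, -1, -1)))   # conjugate pair: close to zero
    print(abs(residual(0.3, 0.7, +1, -1)))   # mismatched signs: of order 2 pi^2 |psi|
```

The third case shows why a single real function cannot satisfy the first-order equation: the real part is the sum of the two complex solutions, each of which obeys a different member of the pair (14.102).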

Having established the time-dependent wave equation, Schrödinger was in a position 
to tackle a wide variety of problems in quantum physics. In the remainder of Part 4, he 
concentrates upon the problem of the theory of dispersion, a problem which had been 
successfully tackled by Kramers and Heisenberg (1925) (see Sect. 10.3). He takes the 
incident radiation to be described as a perturbation to the potential V associated with the 
electric field of the incident waves. Thus, the potential can be written 


$$V = V_0 + A(x)\cos(2\pi\nu t) , \tag{14.103}$$


where A(x) is the perturbation to the potential due to the incident electric field F of the light 
waves and can be written $-F\sum e_i z_i$. Then, the time-dependent Schrödinger equation 
becomes 

$$\nabla^2\psi - \frac{8\pi^2 m}{h^2}\left(V_0 + A\cos 2\pi\nu t\right)\psi = \pm\frac{4\pi i m}{h}\frac{\partial\psi}{\partial t} . \tag{14.104}$$

Schrödinger proceeds to solve this equation, first by finding the unperturbed solution and 
then by treating the time-varying incident field as a perturbation of that solution. We will 
not go through that analysis but simply note the form of the perturbed wavefunction, 


$$\psi = u_k(x)\, e^{2\pi i E_k t/h} + \frac{1}{2}\sum_{n=1}^{\infty} a'_{kn}\, u_n(x) \left[\frac{\exp\left[(2\pi i t/h)(E_k + h\nu)\right]}{E_k - E_n + h\nu} + \frac{\exp\left[(2\pi i t/h)(E_k - h\nu)\right]}{E_k - E_n - h\nu}\right] . \tag{14.105}$$


The first term on the right-hand side of the expression is the free vibration of the system. 
The second term shows the effect of the perturbation caused by the incident radiation. Note 
that this formula excludes the resonant case in which $h\nu = E_k - E_n$. The important point 
is that, just as in Kramers and Heisenberg’s analysis, there are two terms corresponding 
to the cases of induced absorption and emission of radiation from states with energy hν 
displaced from $E_k - E_n$. 

The aim of the calculation was to determine the electric dipole moment, or polarisation, 
of the medium under the influence of the incident radiation field and Schrödinger adopted 
the 


‘heuristic hypothesis [that] the field scalar ψ represents the electric density as a function 
of space coordinates and the time, if x stands for only three space coordinates, i.e. if we 
are dealing with the problem of one electron.’ 


Then, the integral of $\psi\bar\psi$ multiplied by the charge of the particle taken over all the 
coordinates of the system is taken to represent the distribution of electric charge. Integrating over 
all particles of the system, the induced electric dipole moment associated with the incident 
radiation could be found. Specifically, if the classical expression for the dipole moment is 
$M_y = \sum e_i y_i$, the resultant electric dipole moment is 


$$\int M_y\, \psi\bar\psi\, \rho\, dx , \tag{14.106}$$

where ρ is the weighting function which ensures that the wavefunctions are self-adjoint. 
Schrödinger found that his expression for the dispersion of the medium was of similar 
form to that deduced by Kramers and Heisenberg, but also was an improvement of their 
expression. In later sections of the paper, he goes on to consider the resonant and degenerate 
cases. 

Schrödinger appreciated that these revolutionary new procedures opened up the route to 
treating a vast range of problems in atomic physics. In his words, 


‘By superposing the perturbations due to a constant electric or magnetic field and a light 
wave, we obtain magnetic and electric double refraction, and the magnetic rotation of 
the plane of polarisation. Resonance radiation in a magnetic field also comes under this 
heading, ... Further, we can treat the action of an α-particle or electron flying past the 
atom in this way, if the encounter is not too close for the perturbation of each of the two 
systems to be calculable from the undisturbed motion of the other. All these questions 
are mere matters of calculation as soon as the [eigenvalues] and [eigenfunctions] of the 
unperturbed systems are known.’ 


14.9 Reflections 


Schrödinger’s was a quite astonishing achievement. To quote Jammer (1989): 


‘Schrödinger’s brilliant paper was undoubtedly one of the most influential contributions 
ever made in the history of science. It deepened our understanding of atomic phenomena, 
served as a convenient foundation for the mathematical solution of problems in atomic 
physics, solid state physics and, to some extent, also in nuclear physics, and finally opened 
new avenues for thought. In fact, the subsequent development of non-relativistic quantum 
theory was to no small extent merely an elaboration and application of Schrödinger’s work.’ 


The impact upon the community of physicists was instant for here was a system of 
wave mechanics which was founded upon well tried and tested procedures in analytic 
dynamics. Eigenvalues and eigenfunctions were the staple diet of the classical theorist 
and here they found new applications in physics at the atomic level. The techniques were 
immediately adopted by physicists such as Enrico Fermi and many others who found the 
matrix mechanics of Born, Heisenberg and Jordan and Dirac’s q-numbers obscure and 
difficult to apply to real physical problems. In contrast, they felt at home with Schrödinger’s 
transparent scheme of wave mechanics. 

But, there was still a long way to go. We have intentionally discussed only five of the six 
papers which Schrödinger published in 1926. We missed out his paper on the reconciliation 
of the Born—Heisenberg—Jordan approach to quantum problems by matrix mechanics with 
wave mechanics — this is the subject of the next chapter. Furthermore, spin has not yet been 
incorporated into the infrastructure of quantum mechanics. Schrödinger fully appreciated 
that spin had to be included in his scheme of wave mechanics and this would involve an 
extension of the methods he had so successfully deployed so far. A central role would be 
played by a deeper understanding of the role of linear operators. 
Reconciling matrix and wave mechanics 





The physics community was now faced with two theories of quantum phenomena which 
could scarcely have differed more radically from one another and yet both had achieved 
remarkable successes in explaining precisely the same physical phenomena - the spectral 
lines of the hydrogen atom, the zero point energy of quantum systems, the quantisation 
of the harmonic oscillator, the quantum rotator and the Stark effect. Furthermore, both 
theories could account for the experimental data, unlike the predictions of the old quantum 
theory. Perhaps these quite different approaches are not so surprising when it is appreciated 
that matrix and wave mechanics started from the diametrically opposite poles of the 
wave–particle duality. 

At the heart of the Heisenberg approach¹ was the fundamental role played by the 
non-commutative behaviour of the quantum variables and the quantisation of both the 
momentum and spatial variables. To accommodate these features, a new mathematical 
calculus had been invented from the realisation that matrices followed precisely the correct 
algebraic rules. The elaboration of this scheme led to the concept of the energy levels of a 
quantum system being associated with the diagonalisation of matrices using the eigenvalue 
procedure. As Jammer remarks, the theory 


‘... defied any pictorial representation; it was an algebraic approach which, proceeding 
from the observed discreteness of spectral lines, emphasised the element of discontinuity; 
in spite of its renunciation of classical description in space and time it was ultimately a 
theory whose basic conception was the corpuscle.’ 


In complete contrast, Schrödinger’s approach was firmly based upon de Broglie’s insight 
into the wave properties of particles and the need to describe these by a wave equation. 
Again, quoting Jammer, 


‘Schrödinger’s [approach]... was based on the familiar apparatus of differential 
equations, akin to the classical mechanics of fluids and suggestive of an easily visualisable 
representation: it was an analytical approach which, proceeding from a generalisation of 
the laws of motion, stressed the element of continuity, and, as its name indicates, it was a 
theory whose basic concept was the wave.’ 


Neither Heisenberg nor Schrödinger was initially particularly impressed by the other’s 
approach. Heisenberg wrote to Pauli that 


‘The more I ponder about the physical part of Schrödinger’s theory, the more disgusting 
it seems to me,’ 


while Schrödinger stated that 

‘I was discouraged, if not repelled, by what appeared to me a rather difficult method of 
transcendental algebra, defying any visualisation.’ 


Despite these reservations, a number of authors, including Schrödinger, sought to 
understand how the different approaches could be reconciled. There were general similarities, for 
example, in the use of eigenvalues in matrix algebra and in the solutions of the Schrödinger 
wave equation. Schrödinger was in the vanguard of this reconciliation when he discovered 
what he referred to as ‘a formal mathematical identity’ of wave and matrix mechanics. But 
this was only one of a number of insights into the mathematics necessary to accommodate 
both theories in a coherent picture — Lanczos, Born, Wiener, Pauli and Eckart all made 
key contributions to the reconciliation of the two approaches. These endeavours led to a 
deepening of the understanding of the content of the new theories and set the scene for the 
development of the modern theory of quantum mechanics. Many of these developments 
took place almost simultaneously and independently — they are all part of Born’s ‘tangle 
of interconnected alleys’. Let us begin with Schrödinger and then introduce the insights of 
the other pioneers. 


15.1 Schrödinger (1926d) 


Schrödinger interrupted his series of papers Quantisation as an eigenvalue problem 
between Parts 2 and 3 to write his paper On the relation between the quantum mechanics of 
Heisenberg, Born, and Jordan, and that of Schrödinger which was received by the Annalen 
der Physik on 18 March 1926 (Schrödinger, 1926d). He had been seeking an 
accommodation between the two theories, but as late as 22 February 1926 he still had failed to find 
a solution. Then, suddenly, in a matter of a couple of weeks, he found the answer he was 
looking for. In addition to demonstrating the equivalence of the theories, Schrödinger’s 
paper was a stout defence of the wave mechanical approach to quantum physics, claiming 
that it contained all the mathematical apparatus needed to account for the results obtained 
by matrix mechanics with the additional virtue of being visualisable. 

Schrödinger went straight to the heart of matrix mechanics, starting with Born’s 
fundamental relation (12.14), from which all else followed 


$$pq - qp = \frac{h}{2\pi i}\, I , \tag{15.1}$$


where p and q are the matrices associated with the momentum and position variables 
respectively and I is the unit matrix. 

Immediately, Schrödinger translates (15.1) into the language of operator calculus. As 
we will discover, all the methods of reconciling matrix and wave mechanics involved a 
deepening appreciation of the role of operators in quantum mechanics. Their significance 
lay in the fact that it was already well-known in the mathematical literature that operator 
calculus is non-commutative. Thus, if A and B are operators, the operator AB is not, in 
general, the same as the operator BA, in other words, such operators do not commute. These 
and many more of the key properties of operators had been surveyed by Salvatore Pincherle 
in an important article of 1906 entitled Functional operators and equations published 
in the Encyklopädie der mathematischen Wissenschaften (Pincherle, 1906). We will find 
that the importance of operator calculus began to be appreciated by all those who sought 
to reconcile the different approaches to quantum mechanics and the essential tools were 
already formulated in the section of Pincherle’s review entitled The elements of operator 
calculus. 

It is worth quoting Schrödinger’s words to understand how he approached the 
reconciliation of the two theories. 


‘The starting point in the construction of matrices is given by the simple observation 
that Heisenberg’s peculiar calculating laws for functions of the double set of n 
quantities $q_1, q_2, \ldots, q_n$; $p_1, p_2, \ldots, p_n$ (position- and canonically conjugate momentum 
co-ordinates) agree exactly with the rules, which ordinary analysis makes linear differential 
operators obey in the single set of n variables $q_1, q_2, \ldots, q_n$. So, the co-ordination has 
to occur in such a manner that each $p_l$ in the function is to be replaced by the operator 
$\partial/\partial q_l$. Actually, the operator $\partial/\partial q_l$ is exchangeable with $\partial/\partial q_m$, where m is arbitrary, 
but with $q_m$ only, if $m \neq l$. The operator, obtained by interchange and subtraction when 
m = l, namely, 

$$\frac{\partial}{\partial q_l}\, q_l - q_l\, \frac{\partial}{\partial q_l} , \tag{15.2}$$

when applied to any arbitrary function of the qs, reproduces the function, that is, the 
operator gives identity. This simple fact will be reflected in the domain of matrices as 
Heisenberg’s interchange rule.’ 


This is a key point in the argument. If we write an arbitrary function of the $q_l$ as 
$\psi(q_1, q_2, \ldots, q_n) = \psi(q)$, then 


$$\left(\frac{\partial}{\partial q_l}\, q_l - q_l\, \frac{\partial}{\partial q_l}\right)\psi(q) = \frac{\partial}{\partial q_l}\left[q_l\,\psi(q)\right] - q_l\,\frac{\partial\psi(q)}{\partial q_l} = \psi(q) + q_l\,\frac{\partial\psi(q)}{\partial q_l} - q_l\,\frac{\partial\psi(q)}{\partial q_l} = \psi(q) , \tag{15.3}$$


confirming Schrödinger’s statement that the operator is an identity operator. 
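
The identity (15.3) is easily verified numerically for a single variable q. In the sketch below, the trial function and the central-difference approximation to the derivative are assumptions made for illustration; any smooth function would serve equally well. The operator $(\partial/\partial q)\,q - q\,(\partial/\partial q)$ is applied to the trial function and the result compared with the function itself.

```python
import math


def ddq(func, d=1e-5):
    """Central-difference approximation to the operator d/dq acting on func."""
    return lambda q: (func(q + d) - func(q - d)) / (2.0 * d)


def interchange_operator(func):
    """Apply (d/dq q - q d/dq) to func, as in (15.2)."""
    def q_times_func(q):
        return q * func(q)
    return lambda q: ddq(q_times_func)(q) - q * ddq(func)(q)


def trial(q):
    """Arbitrary smooth trial function (a choice made for this illustration)."""
    return math.exp(-q * q / 2.0) * math.sin(3.0 * q)


if __name__ == "__main__":
    for q in (-1.3, 0.2, 2.0):
        print(q, interchange_operator(trial)(q), trial(q))
```

Multiplying the operator by K = h/2πi, the factor that appears when p is replaced by K ∂/∂q, turns this identity into the operator form of the commutation relation (15.1).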

With these words, Schrödinger begins his own development of the rules of operator 
calculus. He illustrates the rules by an example in which the function is described by a 
power series in the ps and qs, a single term of which might be 








$$F(q_k, p_k) = f(q_1, \ldots, q_n)\, p_r\, p_s\, p_t\, g(q_1, \ldots, q_n)\, p_{r'}\, h(q_1, \ldots, q_n)\, p_{r''}\, p_{s''} \ldots \tag{15.4}$$


Then, to convert to the operator formalism, he replaces the ps, for example $p_r$, by the 
operator $K\,\partial/\partial q_r$, emphasising that the variables need to be ‘well-arranged’. By this, he 
means that, just as in the case of matrix multiplication, the order of the products of operators 
must be carefully adhered to; the corresponding differentiations must be carried out in a 
strict order. Thus, (15.4) is converted into an operator which Schrödinger writes as $[F, \cdot]$: 

$$[F, \cdot] = f(q_1, \ldots, q_n)\, K^3 \frac{\partial^3}{\partial q_r\, \partial q_s\, \partial q_t}\, g(q_1, \ldots, q_n)\, K \frac{\partial}{\partial q_{r'}}\, h(q_1, \ldots, q_n)\, K^2 \frac{\partial^2}{\partial q_{r''}\, \partial q_{s''}} \ldots \tag{15.5}$$

If the operator $[F, \cdot]$ acts upon the function $u(q_1, \ldots, q_n)$, some new function [F, u] is 
created. The operators must obey the multiplicative rule that, if G is another well-arranged 
operator, applying G to [F, u], a new function [GF, u] is created, meaning that first the 
operator $[F, \cdot]$ acts upon u and then $[G, \cdot]$ operates upon [F, u]. In general, [GF, u] is not 
the same as [FG, u]. 

Next, Schrödinger associates with the operator $[F, \cdot]$ a matrix $F$ through the introduction
of a complete orthonormal set of functions, labelled by indices $k, l$ with $1 \le k, l < \infty$.
Writing $x$ for the set of variables $q_1, q_2, \dots, q_n$ and $\int \mathrm{d}x$ for the integration over all
$q$-space, the normalised orthonormal set of functions is


$$ u_1(x)\sqrt{\rho(x)},\quad u_2(x)\sqrt{\rho(x)},\quad u_3(x)\sqrt{\rho(x)},\quad \dots \text{ ad inf.} , \tag{15.6} $$


with the normalisation conditions


$$ \int \rho(x)\, u_i(x)\, u_k(x)\, \mathrm{d}x = \begin{cases} 0 & \text{for } i \neq k , \\ 1 & \text{for } i = k , \end{cases} \tag{15.7} $$

where $\rho(x)$ is the weight function introduced to ensure that the normalised functions are
self-adjoint. Schrödinger defines the elements of the matrix $F^{kl}$ by the following integral:


$$ F^{kl} = \int \rho(x)\, u_k(x)\, [F, u_l(x)]\, \mathrm{d}x . \tag{15.8} $$
In his words, 


‘... a matrix element is computed by multiplying the function of the orthogonal system
denoted by the row-index (whereby we understand always $u_i$, not $u_i\sqrt{\rho}$) by the “density
function” $\rho$, and by the result arising from using our operator on the orthogonal function
corresponding to the column-index, and then by integrating the whole over the domain.’


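Schrödinger identifies $q_l$ with the ‘scalar operator’ $q_l$ and $p_l$ with the differential operator
$K\,\partial/\partial q_l$. Thus, matrix elements can be associated with $q_l$ and $p_l$ as follows:


$$ q_l^{ik} = \int q_l\, \rho(x)\, u_i(x)\, u_k(x)\, \mathrm{d}x , \tag{15.9} $$

$$ p_l^{ik} = K \int \rho(x)\, u_i(x)\, \frac{\partial u_k(x)}{\partial q_l}\, \mathrm{d}x . \tag{15.10} $$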


Next, this formalism is used to derive Born’s quantum condition (15.1). The rule (15.8)
can be used to derive the $ik$ component of the matrix associated with the operator
$[F, \cdot] = p_l q_l - q_l p_l$:

\begin{align}
(p_l q_l - q_l p_l)^{ik} &= \int \rho(x)\, u_i(x)\, [F, u_k(x)]\, \mathrm{d}x , \tag{15.11} \\
&= K \int \rho(x)\, u_i(x) \left\{ \frac{\partial}{\partial q_l}\left[q_l\, u_k(x)\right] - q_l\, \frac{\partial u_k(x)}{\partial q_l} \right\} \mathrm{d}x , \tag{15.12} \\
&= K \int \rho(x)\, u_i(x)\, u_k(x)\, \mathrm{d}x = \begin{cases} 0 & \text{for } i \neq k , \\ K & \text{for } i = k , \end{cases} \tag{15.13}
\end{align}

because of the fact that (15.2) is an identity operator and because of the orthogonality of the
set of functions $u_i(x)$ according to (15.7). This is exactly Born’s quantum relation (15.1) if
we set $K = h/2\pi i$. This calculation represents the introduction of the momentum operator

$$ p_l = \frac{h}{2\pi i}\, \frac{\partial}{\partial q_l} \tag{15.14} $$

into quantum physics.
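
Born's condition can also be illustrated directly with matrices. The following sketch assumes the standard harmonic-oscillator matrices for $q$ and $p$ in units with $h/2\pi = m = \omega = 1$, an example that is not taken from this section, and forms $pq - qp$ for a finite truncation. In these units $h/2\pi i = -i$, so the diagonal elements should equal $-i$. The last diagonal element is spoiled by the truncation, a reminder that the relation can hold only for infinite matrices. The helper names are hypothetical.

```python
import math


def oscillator_matrices(size):
    """Truncated q and p matrices of the harmonic oscillator (hbar = m = omega = 1)."""
    q = [[0j] * size for _ in range(size)]
    p = [[0j] * size for _ in range(size)]
    for n in range(size - 1):
        amp = math.sqrt((n + 1) / 2.0)
        q[n][n + 1] = q[n + 1][n] = amp   # real, symmetric
        p[n][n + 1] = -1j * amp           # purely imaginary, antisymmetric
        p[n + 1][n] = 1j * amp
    return q, p


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    size = len(a)
    return [[sum(a[i][m] * b[m][k] for m in range(size)) for k in range(size)]
            for i in range(size)]


def commutator_pq(size):
    """Return the matrix pq - qp for the truncated oscillator matrices."""
    q, p = oscillator_matrices(size)
    pq, qp = matmul(p, q), matmul(q, p)
    return [[pq[i][k] - qp[i][k] for k in range(size)] for i in range(size)]


c = commutator_pq(6)
# Diagonal of pq - qp: -i everywhere except the final, truncation-affected entry.
print([c[i][i] for i in range(6)])
```

The off-diagonal elements vanish and all but the last diagonal element equal $-i$, in agreement with (15.13) for $K = h/2\pi i$.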

Schrödinger shows that every function $F(p, q)$ can be translated into an operator $[F, \cdot]$
which in turn can be translated into a matrix $F^{kl}$ using (15.8). Furthermore, these matrices
obey all the rules of matrix mechanics required by Born, Heisenberg and Jordan (1926).
As expressed by Jammer (1989),


‘Any wave-mechanical equation could therefore consistently be translated into a matrix
equation, the operation of $F$ on the wave function $\psi$, corresponding to the application of
the matrix $(F^{kl})$ on the column vector $(a_i)$ whose components are the Fourier coefficients
of $\psi$.’


Schrödinger was again in full flight — next he takes the complete set of orthonormal 
functions to be the eigenfunctions of the wave equation, which could be written as an 
operator equation 


$$ [H, \psi] = E\psi , \tag{15.15} $$


where $[H, \cdot]$ is the operator associated with the Hamiltonian of the system. In fact, (15.15)
is just his wave equation as can be seen as follows: 


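$$ H = \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right) + U(x, y, z) = \frac{1}{2m}\left(p_x p_x + p_y p_y + p_z p_z\right) + U(x, y, z) . \tag{15.16} $$

Converting this well-arranged function into an operator using $p_x = (h/2\pi i)\,\partial/\partial x$,
$p_y = (h/2\pi i)\,\partial/\partial y$, $p_z = (h/2\pi i)\,\partial/\partial z$, substituting into (15.16) and applying (15.15),
we find

$$ -\frac{h^2}{8\pi^2 m}\left(\frac{\partial^2 \psi}{\partial x^2} + \frac{\partial^2 \psi}{\partial y^2} + \frac{\partial^2 \psi}{\partial z^2}\right) + U(x, y, z)\,\psi = E\psi , \quad\text{that is,}\quad \nabla^2 \psi + \frac{8\pi^2 m}{h^2}\left[E - U(x, y, z)\right]\psi = 0 , \tag{15.17} $$

exactly Schrödinger’s time-independent wave equation (14.37). This calculation is the
‘standard’ method of deriving the Schrödinger wave equation using operator techniques.

Next, Schrödinger had to establish the rules of differentiation of the operator $F$ with
respect to the position $q_l$ and momentum $p_l$ coordinates. Just as in the case of Born,
Heisenberg and Jordan’s matrix mechanics, the definitions need some care. In his operator
notation, Schrödinger showed that the appropriate forms are

$$ \left[\frac{\partial F}{\partial q_l}, \cdot\right] = \frac{1}{K}\left[p_l F - F p_l, \cdot\right] , \tag{15.18} $$

$$ \left[\frac{\partial F}{\partial p_l}, \cdot\right] = \frac{1}{K}\left[F q_l - q_l F, \cdot\right] , \tag{15.19} $$

where the $p_l$ and $q_l$ are variables, not operators. Born, Heisenberg and Jordan had shown
that Hamilton’s equations of motion (12.67) could be written in matrix form, the $ik$ elements
of the matrices being

$$ \left(\frac{\partial q_l}{\partial t}\right)^{ik} = \left(\frac{\partial H}{\partial p_l}\right)^{ik} , \qquad \left(\frac{\partial p_l}{\partial t}\right)^{ik} = -\left(\frac{\partial H}{\partial q_l}\right)^{ik} , \tag{15.20} $$

where $l = 1, 2, 3, \dots, n$ and $i, k = 1, 2, 3, \dots$ ad inf. Now, according to (12.50), the time
derivative of the matrix element $q_l^{ik}$ associated with the pair of stationary states $i$ and $k$ is
given by

$$ \left(\frac{\partial q_l}{\partial t}\right)^{ik} = 2\pi i\, \nu(ik)\, q_l^{ik} = 2\pi i\,(\nu_i - \nu_k)\, q_l^{ik} , \tag{15.21} $$

where the frequencies $\nu_i$ and $\nu_k$ have been associated with the states $i$ and $k$ respectively.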
Now identifying $[F, \cdot]$ with the Hamiltonian operator $[H, \cdot]$ and using (15.19) and the first
equation of (15.20), we find


$$ (\nu_i - \nu_k)\, q_l^{ik} = \frac{1}{h}\,(H q_l - q_l H)^{ik} . \tag{15.22} $$


Schrödinger now chooses the complete set of eigenfunctions associated with his wave
equation (15.15) as the basis for the determination of the matrix elements and so

$$ H^{ik} = \int \rho(x)\, u_i(x)\, [H, u_k(x)]\, \mathrm{d}x = E_k \int \rho(x)\, u_i(x)\, u_k(x)\, \mathrm{d}x = \begin{cases} E_i & \text{for } i = k , \\ 0 & \text{for } i \neq k . \end{cases} \tag{15.23} $$

This expression is then used to work out the terms $(H q_l)^{ik}$ and $(q_l H)^{ik}$. We recall that we
need to sum over all the possible intermediate states $m$ to find the value of, say, $(H q_l)^{ik}$.
Note that this procedure can be traced back to Heisenberg’s rule (11.18) that the ‘classical’
Fourier term $x(n, \alpha) = \sum_{\alpha'} x(n, \alpha')\, x(n, \alpha - \alpha')$ should be converted into the matrix
format of Born, Heisenberg and Jordan (1926) as $x^{ik} = \sum_m x^{im} x^{mk}$. Thus,

$$ (H q_l)^{ik} = \sum_m H^{im}\, q_l^{mk} . \tag{15.24} $$

But, according to the rules (15.23), the only non-zero member of the infinite series is that
for which $i = m$, and for this case $H^{ii} = E_i$. Therefore,

$$ (H q_l)^{ik} = E_i\, q_l^{ik} . \tag{15.25} $$

Performing the same analysis for $(q_l H)^{ik}$,

$$ (q_l H)^{ik} = \sum_m q_l^{im}\, H^{mk} = E_k\, q_l^{ik} . \tag{15.26} $$

It immediately follows from (15.25) and (15.26) that, substituting these values into (15.22),

$$ (\nu_i - \nu_k) = \frac{1}{h}\,(E_i - E_k) , \tag{15.27} $$


relating the frequencies of transition to the differences of energies of the stationary states. 
Schrödinger was triumphant: 


‘Hence the solution of the whole system of matrix equations of Heisenberg, Born and
Jordan is reduced to the natural boundary value problem of a linear partial differential
equation. If we have solved the boundary value problem, then by the use of [15.8] we can
calculate by differentiations and quadratures every matrix element we are interested in.’


These were considerable achievements and demonstrated the equivalence of matrix and
wave mechanics. Strictly speaking, Schrödinger had demonstrated that he could translate
wave mechanics into matrix mechanics and obtain all the results derived by Born,
Heisenberg and Jordan. It was not so obvious that the translation would work completely 
symmetrically in the opposite direction, namely, did matrix mechanics necessarily imply 
wave mechanics, or did it contain additional features beyond those which could be described 
by wave mechanics? A number of authors were hot on the trail. 
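
Schrödinger's recipe of ‘differentiations and quadratures’ can be tried out on a case where the answer is known. The sketch below assumes the normalised harmonic-oscillator eigenfunctions in units with $h/2\pi = m = \omega = 1$ and a weight function $\rho = 1$. It evaluates integrals of the form (15.8) to (15.10) by the trapezoidal rule and uses a central difference for the derivative. The grid limits, step sizes and function names are illustrative assumptions.

```python
import math


def hermite_functions(n_max, x):
    """Values u_0(x) .. u_{n_max}(x) of the normalised oscillator eigenfunctions."""
    u = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]
    if n_max >= 1:
        u.append(math.sqrt(2.0) * x * u[0])
    for n in range(1, n_max):
        # Three-term recurrence for the normalised Hermite functions.
        u.append(math.sqrt(2.0 / (n + 1)) * x * u[n]
                 - math.sqrt(n / (n + 1)) * u[n - 1])
    return u


def apply_identity(k, x):
    return hermite_functions(k, x)[k]


def apply_q(k, x):
    """The 'scalar operator' q acting on u_k."""
    return x * hermite_functions(k, x)[k]


def apply_ddx(k, x, eps=1e-4):
    """d/dx acting on u_k, by central difference (multiply by K to obtain p)."""
    return (hermite_functions(k, x + eps)[k]
            - hermite_functions(k, x - eps)[k]) / (2 * eps)


def matrix_element(i, k, operator, lo=-10.0, hi=10.0, steps=4000):
    """Trapezoidal estimate of the integral of u_i(x) * [operator u_k](x), rho = 1."""
    h = (hi - lo) / steps
    total = 0.0
    for j in range(steps + 1):
        x = lo + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * hermite_functions(i, x)[i] * operator(k, x)
    return total * h


print(matrix_element(0, 0, apply_identity))  # normalisation, cf. (15.7)
print(matrix_element(0, 1, apply_q))         # q^{01}, cf. (15.9)
print(matrix_element(0, 1, apply_ddx))       # p^{01} / K, cf. (15.10)
```

Under these assumptions $q^{01}$ and the integral of $u_0\, \mathrm{d}u_1/\mathrm{d}x$ both come out close to $1/\sqrt{2}$, so that $p^{01} = K/\sqrt{2}$, which agrees with the oscillator matrices of matrix mechanics.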


15.2 Lanczos (1926) 


In fact, Schrödinger was not the first to seek a reformulation of the matrix mechanics of Born 
and Jordan (1925b) in terms of continuous functions. As soon as their paper of 1925 was 
published, Kornel (Cornelius) Lanczos showed that their matrix formulation of quantum 
mechanics could be written in terms of the kernels found in integral equations. This did 
not necessarily advance the cause of the new quantum mechanics among the community 
of physicists since, as remarked by Jammer (1989): 


“...the use of integral equations with which physicists were — and still are — much less 
familiar than with differential equations, the absence of any specific example or new result, 
and certainly also the fact that the publication almost coincided with that of Schrödinger’s 
first communication, explain the relatively cool reception Lanczos’s paper was given.’ 


Lanczos had already used integral equations in his studies of general relativity in the weak 
field limit and had become familiar with the techniques through Hilbert’s pioneering study 
of 1912 (Hilbert, 1912) and the relevant chapters of Courant and Hilbert’s Methods of 
Mathematical Physics (1924). 

A key feature of Heisenberg’s paper of 1925 was that the functions depended upon pairs
of variables, say $m$ and $n$. Lanczos appreciated that the kernel function $K(s, \sigma)$ found
in integral equations also depended upon two points $s$ and $\sigma$ in a coordinate space of
arbitrary dimensions. If the kernel is symmetrical, a system of eigenfunctions $\varphi^i(s)$, with
$i = 1, 2, 3, \dots$ can be associated with the solution of the integral equation


$$ \varphi(s) = \lambda \int K(s, \sigma)\, \varphi(\sigma)\, \mathrm{d}\sigma , \tag{15.28} $$


where $\lambda$ is the corresponding eigenvalue. The system of eigenfunctions is orthonormal,
complete and satisfies the condition


$$ \int \varphi^i(s)\, \varphi^k(s)\, \mathrm{d}s = \begin{cases} 1 & \text{for } i = k , \\ 0 & \text{for } i \neq k . \end{cases} \tag{15.29} $$


Hence the function $f(s, \sigma)$ can be expanded in terms of the $\varphi^i(s)$ as


$$ f(s, \sigma) = \sum_i a_i(\sigma)\, \varphi^i(s) . \tag{15.30} $$


Then, the $a_i(\sigma)$ can be expanded in terms of the same infinite sequence of eigenfunctions
$\varphi^k(\sigma)$ and so


$$ f(s, \sigma) = \sum_{i,k} a_{ik}\, \varphi^i(s)\, \varphi^k(\sigma) . \tag{15.31} $$

Lanczos identified the matrix $a = [a_{ik}]$ with the matrices of Born and Jordan’s matrix
mechanics. He further demonstrated how the complete system of matrix mechanics in Born 


and Jordan’s paper could be expressed in terms of the kernels of linear integral equations. 
He concluded, 


‘Because of the mutual unique relation existing between the matrices and the kernel 
functions used in our representation, it does not matter at all for the formal treatment of 
the [quantum mechanical] problem, whether one writes down the fundamental equations in 
the form of integral equations — as we have done here — or whether one starts immediately 
from the constituent coefficients and applies the matrix equation.’ 


The similarities with Schrödinger’s paper are apparent and he acknowledged Lanczos’s 
contributions in a footnote to his paper. There is no question, however, about the greater 
impact of Schrödinger’s paper which was couched in much more accessible mathematical
terms and was also part of a sequence of ground-breaking insights into the nature of the 
new quantum mechanics. 


15.3 Born and Wiener’s operator formalism 


In Sect. 12.3, we left Max Born on his way to the Massachusetts Institute of Technology, 
having completed the ‘Three-Man Paper’ with Heisenberg and Jordan. Born had met 
Norbert Wiener in 1924 when the latter was a visitor to Göttingen. An exchange programme 
was set up by Courant and Wiener, one of the first fruits of this collaboration being Born’s 
appointment as a ‘foreign lecturer’ at MIT for the fall term of 1925-1926. Born’s lectures 
were centred upon the new matrix mechanics, but he was well aware of the shortcomings 
of what he, Heisenberg and Jordan had achieved. They had found it impossible to deal 
with aperiodic motions, including linear motion in a straight line, and, as described in 
Sect. 12.4, there was no simple equivalence for the action—angle variables which were the 
natural language of the old quantum theory. Furthermore, there were real mathematical 
problems with Born’s assumptions about the properties of unbounded matrices. Despite 
these concerns, the theory agreed well with the experimental results and eliminated the 
intractable problems of the old quantum theory. Born fully realised that an extension of 
matrix mechanics was needed and Wiener turned out to be exactly the person to do this. 
Wiener was a mathematical prodigy who was to make many fundamental contributions 
to stochastic processes, in particular to the mathematical elaboration of fundamental topics 
in electronic engineering, electronic communication and control systems. Just before Born 
arrived in Cambridge, Massachusetts, Wiener had written a comprehensive account of the 
operational calculus, in particular, a rigorous study of the concept of operators, including
Volterra’s integral transformations and Pincherle’s transformations of one power series to
another (Wiener, 1926). In his autobiography, Wiener wrote: 


‘When Professor Born came to the United States he was enormously excited about the
new basis Heisenberg had just given for the quantum theory of the atom. Born wanted a 
theory which would generalise these matrices . . . . The job was a highly technical one, and 
he counted on me for aid. ...I had the generalisation of matrices already at hand in the 
form of what is known as operators. Born had a good many qualms about the soundness 
of my method and kept wondering if Hilbert would approve of my mathematics. Hilbert 
did, in fact, approve of it, and operators have since remained an essential part of quantum 
theory.’ (Wiener, 1956) 


As Jammer (1989) notes, the mathematical problems associated with the formulation of
the new quantum mechanics fostered a spirit of collaboration between the pure mathematicians
and the physicists, among the first fruits of this collaboration being the joint paper by
Born and Wiener which was received by the Zeitschrift für Physik on 5 January 1926 (Born
and Wiener, 1926). Jammer provides a clear description of the route Born and Wiener 
followed in formally introducing operators into quantum mechanics — we will follow his 
presentation here. The main topic of Wiener’s paper of 1926 had been a generalisation 
of the Fourier integral which made it possible to apply integral operators to both analytic 
and non-analytic functions and this found immediate application in Born and Wiener’s 
extension of matrix mechanics. 

Consider the Fourier series for the two functions $y(t)$ and $x(t)$ of time $t$:


$$ y(t) = \sum_m y_m \exp(2\pi i W_m t/h) , \tag{15.32} $$

$$ x(t) = \sum_n x_n \exp(2\pi i W_n t/h) . \tag{15.33} $$


The aim of the calculation is to develop an operator expression relating $y(t)$ and $x(t)$,
written symbolically $y(t) = q\,x(t)$, where $q$ will turn out to be an integral operator. The $x_n$
are found by multiplying (15.33) by $\exp(-2\pi i W_k t/h)$ and integrating over all $t$ from $-\infty$
to $+\infty$. Then, the only non-zero term is that for which $k = n$, so that


$$ x_n = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{+T} x(s) \exp(-2\pi i W_n s/h)\, \mathrm{d}s . \tag{15.34} $$

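Now, let us adopt the matrix transformation

$$ y_m = \sum_n q_{mn}\, x_n . \tag{15.35} $$

Then, the expression for $y(t)$ becomes

\begin{align}
y(t) &= \sum_m y_m \exp(2\pi i W_m t/h) = \sum_{m,n} q_{mn}\, x_n \exp(2\pi i W_m t/h) , \tag{15.36} \\
&= \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{+T} \sum_{m,n} q_{mn}\, x(s) \exp\left[2\pi i (W_m t - W_n s)/h\right] \mathrm{d}s . \tag{15.37}
\end{align}

If we now write

$$ q(t, s) = \sum_{m,n} q_{mn} \exp\left[2\pi i (W_m t - W_n s)/h\right] , \tag{15.38} $$

we can write

$$ y(t) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{+T} q(t, s)\, x(s)\, \mathrm{d}s . \tag{15.39} $$

Thus, $x(s)$ is converted into $y(t)$ by the integral operator

$$ q = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{+T} \mathrm{d}s\; q(t, s) \cdots , \tag{15.40} $$

where it is understood that the function $x(s)$ on which the operator acts is to be taken inside
the integral. They then introduced the definition of a linear operator: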
introduced the definition of a linear operator: 


‘An operator is a rule in accordance with which we may obtain from a function $x(t)$
another function $y(t)$. ... It is linear if


$$ q[x(t) + y(t)] = q\,x(t) + q\,y(t) . \tag{15.41} $$’


Having established the procedure for obtaining the relation $y(t) = q\,x(t)$, Born and Wiener
next consider the operator $Dq = \partial q/\partial t$. From (15.39), we find

$$ y(t) = Dq\, x(t) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{+T} \frac{\partial q(t, s)}{\partial t}\, x(s)\, \mathrm{d}s . \tag{15.42} $$

Now, using the relation (15.38) for $q(t, s)$, differentiation gives

$$ \frac{\partial q(t, s)}{\partial t} = \frac{2\pi i}{h} \sum_{m,n} q_{mn}\, W_m \exp\left[2\pi i (W_m t - W_n s)/h\right] . \tag{15.43} $$

Just as (15.38) defines the relation between the operator $q$ and the matrix $q_{mn}$, so (15.43)
provides the relation between the operator $Dq$ and the matrix element $(Dq)_{mn}$, namely

$$ (Dq)_{mn} = \frac{2\pi i}{h}\, W_m\, q_{mn} . \tag{15.44} $$
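Next, consider the operator $qD = q\,\partial/\partial t$. Then,

$$ y(t) = q\, \frac{\partial x(t)}{\partial t} = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{+T} q(t, s)\, \frac{\partial x(s)}{\partial s}\, \mathrm{d}s . \tag{15.45} $$

Carrying out a partial integration, this integral becomes

$$ y(t) = q\, \frac{\partial x(t)}{\partial t} = -\lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{+T} \frac{\partial q(t, s)}{\partial s}\, x(s)\, \mathrm{d}s . \tag{15.46} $$

Now, using the relation (15.38) for $q(t, s)$, differentiation gives

$$ \frac{\partial q(t, s)}{\partial s} = -\frac{2\pi i}{h} \sum_{m,n} q_{mn}\, W_n \exp\left[2\pi i (W_m t - W_n s)/h\right] . \tag{15.47} $$

We find the relation between the operator $qD$ and the matrix element $(qD)_{mn}$, namely,

$$ (qD)_{mn} = \frac{2\pi i}{h}\, q_{mn}\, W_n . \tag{15.48} $$

Hence, the operator $(Dq - qD)$ has matrix elements

$$ (Dq - qD)_{mn} = \frac{2\pi i}{h}\, q_{mn}\, (W_m - W_n) . \tag{15.49} $$

Since $h\nu_{mn} = W_m - W_n$, it follows that

$$ (Dq - qD)_{mn} = 2\pi i\, \nu_{mn}\, q_{mn} . \tag{15.50} $$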
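
The content of (15.50) can be checked with a small numerical example. In the matrix representation, (15.44) and (15.48) say that $D$ acts as the diagonal matrix with elements $2\pi i W_m/h$. The sketch below uses arbitrary illustrative energies and matrix elements, in units with $h = 1$, and compares $Dq - qD$ with the time derivative of $q_{mn}\exp(2\pi i \nu_{mn} t)$, the time dependence of the matrix elements in matrix mechanics. All names and numbers are assumptions made for the illustration.

```python
import cmath
import math

H_PLANCK = 1.0  # units with h = 1, an assumption made for this illustration


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    size = len(a)
    return [[sum(a[i][m] * b[m][k] for m in range(size)) for k in range(size)]
            for i in range(size)]


def commutator_with_d(q, energies):
    """Dq - qD with D represented by the diagonal matrix 2*pi*i*W_m/h."""
    size = len(q)
    d = [[(2j * math.pi * energies[m] / H_PLANCK) if m == n else 0j
          for n in range(size)] for m in range(size)]
    dq, qd = matmul(d, q), matmul(q, d)
    return [[dq[m][n] - qd[m][n] for n in range(size)] for m in range(size)]


def time_derivative(q, energies, dt=1e-6):
    """Central-difference derivative at t = 0 of q_mn * exp(2*pi*i*nu_mn*t)."""
    size = len(q)

    def element(m, n, t):
        nu = (energies[m] - energies[n]) / H_PLANCK
        return q[m][n] * cmath.exp(2j * math.pi * nu * t)

    return [[(element(m, n, dt) - element(m, n, -dt)) / (2 * dt)
             for n in range(size)] for m in range(size)]


energies = [0.5, 1.5, 2.5]        # illustrative values of W_m
q = [[0.0, 0.7, 0.1j],
     [0.7, 0.0, 1.0],
     [-0.1j, 1.0, 0.0]]           # illustrative Hermitian matrix q_mn

lhs = commutator_with_d(q, energies)
rhs = time_derivative(q, energies)
max_diff = max(abs(lhs[m][n] - rhs[m][n]) for m in range(3) for n in range(3))
print(max_diff)  # small: the commutator with D reproduces the time derivative
```

The two matrices agree to within the finite-difference error, which is the sense in which the commutator with $D$ plays the role of differentiation with respect to time.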


Born and Wiener recognised that the right-hand side of (15.50) corresponded to the
derivative of the $mn$ component of the matrix with respect to time, as demonstrated by
(12.49) and (12.50), and so they identified the right-hand side with $\dot{q}_{mn}$. The corresponding
operator equation becomes


$$ Dq - qD = \dot{q} . \tag{15.51} $$


This operator equation can be compared with the corresponding matrix equation (12.53). 
Given the clear correspondence between the matrix and operator formalisms, Born and
Wiener went on to identify the operator form of the commutation relation as


$$ pq - qp = \frac{h}{2\pi i}\, 1 , \tag{15.52} $$


where 1 is the unit operator, and the canonical equations


$$ \dot{q} = \frac{\partial H(p, q)}{\partial p} , \qquad \dot{p} = -\frac{\partial H(p, q)}{\partial q} , \tag{15.53} $$

as operator equations in which the operators $p$ and $q$ are Hermitian. This provided a
complete equivalence between the operator and matrix mechanics formalisms.

With this new formalism, they identified the energy operator with $(h/2\pi i)\,D$. In addition,
they solved the problems of the quantised harmonic oscillator and linear motion, thus
demonstrating the ability of the scheme to deal with both periodic and aperiodic motion.

What is intriguing is that they did not make the simple identification of the momentum
operator $p$ through the relation $p = (h/2\pi i)\,\partial/\partial q$, which would have led directly to wave
mechanics and Schrödinger’s wave equation. As Born lamented,


‘We expressed the energy as $\mathrm{d}/\mathrm{d}t$ and wrote the commutation law for energy and time as
an identity by applying $[t\,(\mathrm{d}/\mathrm{d}t) - (\mathrm{d}/\mathrm{d}t)\,t]$ to a function of $t$; it was absolutely the same
for $q$ and $p$. But we did not see that. And I never will forgive myself, for had we done
this, we would have had the whole wave mechanics from quantum mechanics at once, a
few months before Schrödinger.’


15.4 Pauli’s letter to Jordan 


Born sent a copy of his paper with Wiener to Heisenberg in January 1926. Heisenberg 
then forwarded a copy to Pauli as well as a copy of the paper by Lanczos. Pauli was 
characteristically scathing about the formal mathematics developed by these authors and
developed his own version of operator methods, which we will not enter into here. More
significantly, in early February, Sommerfeld had asked him to look out for the publication 
of Schrödinger’s paper on wave mechanics which Sommerfeld described as involving a
‘totally crazy method’, but which reproduced many of the results of matrix mechanics. 

Pauli worked on Schrödinger’s paper over the Easter holidays while he was visiting
Copenhagen. By early April, he had achieved what he later described as the ‘complete
clarification of the connection of Schrödinger’s theory and quantum mechanics’. The argument
was set out in a letter of 12 April 1926 to Pascual Jordan. Pauli was strongly impressed by
the fact that Schrödinger had been able to recover precisely his results for the hydrogen atom 
and also by the fact that the two theories seemed to be based upon quite different hypotheses 
and mathematical techniques. The first part of the letter to Jordan simply reformulates de 
Broglie’s insights and the derivation of Schrödinger’s wave equation. The second half of 
the letter concerns Pauli’s reconciliation of the two theories. 

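For simplicity, Pauli considers only the one-dimensional case of the Schrödinger wave
equation which he writes


$$ \frac{\mathrm{d}^2 \psi}{\mathrm{d}x^2} + \frac{8\pi^2 m_0}{h^2}\left[E - E_{\mathrm{pot}}(x)\right]\psi = 0 . \tag{15.54} $$

As Schrödinger had shown, for $E < E_{\mathrm{pot}}$, the equation only has solutions for specific energy
eigenvalues $E_1, E_2, E_3, \dots$ with corresponding eigenfunctions $\psi_1, \psi_2, \dots$ and these form
a complete orthonormal set with


$$ \int \psi_n\, \psi_m\, \mathrm{d}x = \begin{cases} 0 & \text{for } n \neq m , \\ 1 & \text{for } n = m . \end{cases} \tag{15.55} $$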


It is simplest to quote the central insights of the letter in Pauli’s own words: 


‘Now one considers in particular the expansion of $x\,\psi_n$:

$$ x\,\psi_n = \sum_m x_{nm}\, \psi_m(x) \quad\text{with}\quad x_{nm} = \int_{-\infty}^{+\infty} x\, \psi_n\, \psi_m\, \mathrm{d}x . \tag{15.56} $$

One also puts

$$ (p_x)_{nm} = \frac{ih}{2\pi} \int_{-\infty}^{+\infty} \frac{\partial \psi_n}{\partial x}\, \psi_m\, \mathrm{d}x ; \qquad \frac{ih}{2\pi}\, \frac{\partial \psi_n}{\partial x} = \sum_m (p_x)_{nm}\, \psi_m(x) . \tag{15.57} $$

... Now, if $x_{nm} = x_{mn}$ is real, $(p_x)_{nm} = -(p_x)_{mn}$ is purely imaginary. It can be shown
without difficulty, that the matrices for $x$ and $p_x$ thus defined satisfy the equations of the
Göttingen mechanics. Namely,

$$ p_x\, x - x\, p_x = \frac{h}{2\pi i}\, 1 \quad\text{and}\quad H = \frac{1}{2m_0}\, p_x^2 + E_{\mathrm{pot}}(x) , \tag{15.58} $$

with $H$ representing the diagonal Hamiltonian matrix whose elements $E_n$ provide the
energy eigenvalues. From the rule of multiplication it follows that the matrix belonging
to any function $F(x)$ of $x$ is just given by the coefficients

$$ F_{nm} = \int_{-\infty}^{+\infty} F(x)\, \psi_n\, \psi_m\, \mathrm{d}x . \tag{15.59} $$

I shall not write out the calculation in detail; you will be able to verify the assertion easily.’

Pauli fully appreciated the great power of Schrödinger’s approach which could be extended
straightforwardly to incorporate perturbation theory and so to the solution of numerous
problems including the intensities of the spectral lines in the Zeeman effect. Two
further quotations from this letter are particularly noteworthy: 


‘In principle, in the Göttingen theory as well as in de Broglie’s statement of the quantum 
problem, no description of the electron in the atom in space and time is given.’ 


‘It seems that one also sees now, how from the point of view of quantum mechanics the 
contradistinction between ‘point’ and ‘set of waves’ fades away in favour of something 
more general.’ 


By 18 April 1926, Schrödinger had learned about Pauli’s achievements and wrote to 
Sommerfeld the same day: 


‘With Pauli I have exchanged a couple of letters. He really is a phenomenal person. How
he has discovered everything again? In a tenth of the time which I needed for it!’


15.5 Eckart and the operator calculus 


In 1925, the 23-year-old Carl Eckart won a fellowship to work at the California Institute
of Technology. He had the good fortune to attend the lectures which Born presented on his 
latest research with Wiener into operator methods in quantum mechanics. These proved 
to be the inspiration for his synthesis of the wave and matrix mechanical approaches to 
quantum theory. Following Born’s lectures, he made a thorough study of the operator 
formalism and, in his words, 


‘The result was that I ... was completely familiar with what is now known as the
Schrödinger operator (the energy operator) before Schrödinger’s papers appeared in
Pasadena.’


Then the papers of Lanczos and Schrödinger were published in March 1926. Eckart realised
that these were fully consistent with the Born–Wiener operator approach. What he
appreciated was the following: 


‘The Lanczos quantum theory requires a set of orthogonal functions, by means of which 
the matrices of Born and Jordan are to be determined... Schrödinger had published a 
quantum principle which leads directly to a set of orthogonal functions... If these are 
interpreted as the functions entering into the Lanczos theory, the matrices of the Born- 
Jordan theory are readily obtained.’ (Eckart, 1926b) 


Eckart published his paper in the Proceedings of the National Academy of Sciences to 
establish the precedence of his ideas and these were elaborated in much more detail in his 
paper published in the Physical Review later the same year (Eckart, 1926a). 

The motivation for these studies was the fact that, although the new Born—Wiener scheme 
of operator dynamics provided a more rigorous foundation for quantum mechanics, it was 
still difficult to find solutions for more than a few of the simpler problems of atomic
physics. Eckart’s objective was to synthesise the approaches of Born and Wiener, Dirac and
Schrödinger into a single operator calculus in which all the quantities appearing in the 
equations are operators. Again, all the necessary tools were already available in Pincherle’s 
review (Pincherle, 1906), but now they had to be reformulated for applications in atomic 
physics. 

In the first part of his paper in the Physical Review, Eckart rewrote the equations of 
classical mechanics entirely in the language of operators. Thus, Hamilton’s equations now 
became 


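$$ \frac{\mathrm{d}P_j}{\mathrm{d}T} = -\frac{\partial H(P, Q)}{\partial Q_j} , \qquad \frac{\mathrm{d}Q_j}{\mathrm{d}T} = \frac{\partial H(P, Q)}{\partial P_j} , \tag{15.60} $$


where the quantities $P$, $Q$, $T$ and $H$ are all operators. In his first paper, Eckart noted that,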


‘In translating ordinary equations into operator notation, nothing new is introduced. The 
transition involves merely a change of mental focus. Instead of concentrating the attention 
on the numerical quantities, it is directed to the operations of combining them.’ 


The words are similar to those of Dirac already quoted:


‘... the classical equations are to be retained formally without alteration... only the
operations by which the quantities involved are combined are to be altered.’ 


Let us outline what Eckart did. His operator calculus made use of the results already
obtained by Dirac and by Born and Wiener. To do this, he adopted general linear operators
which obeyed Born’s condition (15.41) and defined an operator $D_X$ by the following
equation,


$$ D_X X - X D_X = [1] , \tag{15.61} $$


where the right-hand side denotes the unit operator. This expression ensures that the operators
are non-commutative. Then, following the rules of matrix calculus of Born and
Jordan and the $q$-number calculus of Dirac, the following algebraic rules for operators were
established:

\begin{align}
\frac{\mathrm{d}Q}{\mathrm{d}X} &= D_X Q - Q D_X , \tag{15.62} \\
\frac{\mathrm{d}}{\mathrm{d}X}(Q + P) &= \frac{\mathrm{d}Q}{\mathrm{d}X} + \frac{\mathrm{d}P}{\mathrm{d}X} , \tag{15.63} \\
\frac{\mathrm{d}}{\mathrm{d}X}(QP) &= \frac{\mathrm{d}Q}{\mathrm{d}X}\, P + Q\, \frac{\mathrm{d}P}{\mathrm{d}X} . \tag{15.64}
\end{align}

The definition (15.61) and the rule of differentiation (15.62) could be generalised to a
number of independent operator variables $Q_1, Q_2, \dots, Q_n$ so that

$$ D_j Q_i - Q_i D_j = [\delta_{ij}] , \tag{15.65} $$

$$ \frac{\partial F}{\partial Q_j} = D_j F - F D_j . \tag{15.66} $$

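Then, the commutator relations for the quantum operators can be written

$$ \begin{aligned} P_j Q_i - Q_i P_j &= \frac{h}{2\pi i}\, [\delta_{ij}] , \\ Q_i Q_j - Q_j Q_i &= 0 , \\ P_i P_j - P_j P_i &= 0 , \end{aligned} \tag{15.67} $$

and the dynamical equations (15.60) become

$$ \begin{aligned} \frac{\mathrm{d}P_j}{\mathrm{d}T} &= \frac{\mathrm{d}}{\mathrm{d}t}\, P_j - P_j\, \frac{\mathrm{d}}{\mathrm{d}t} = -\frac{\partial H(P, Q)}{\partial Q_j} , \\ \frac{\mathrm{d}Q_j}{\mathrm{d}T} &= \frac{\mathrm{d}}{\mathrm{d}t}\, Q_j - Q_j\, \frac{\mathrm{d}}{\mathrm{d}t} = \frac{\partial H(P, Q)}{\partial P_j} . \end{aligned} \tag{15.68} $$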

Next, the scheme had to be converted into a form which would allow numerical quantities
to be compared with experiment. Eckart used the result which could be derived from Born
and Wiener’s paper, namely,

$$ \frac{1}{\psi}\, P_j\, \psi = p_j , \qquad \frac{1}{\psi}\, Q_j\, \psi = q_j , \tag{15.69} $$

or, more generally,

$$ \frac{1}{\psi}\, F\, \psi = f , \tag{15.70} $$

where $F$ is any function of $P_j$ and $Q_j$. For stationary states, Born and Wiener had used the
result that the time dependence of the eigenfunctions takes the form $\psi_n \propto \exp(2\pi i W_n t/h)$
and so, using (15.70) for the Hamiltonian operator $H(P_j, Q_j)$, it follows that

$$ \frac{1}{\psi_n}\, H(P, Q)\, \psi_n = W_n . \tag{15.71} $$

Eckart noted immediately that


‘[15.71] will be seen [to be] the equation published by Schrödinger in another
form ... which, in addition to defining $\psi_n$, serves to distinguish a certain discrete
sequence of values of $W$ from all others.’


To complete the argument, Eckart had to define the operators associated with $Q_j$ and $P_j$
and he chose the position operator $Q_j$ to mean multiplication by the scalar quantity $q_j$ and
$P_j$ to be the canonically conjugate momentum operator

$$ P_j = \frac{h}{2\pi i}\, \frac{\partial}{\partial q_j} . \tag{15.72} $$

Inserting these definitions into (15.71), Schrödinger’s wave equation follows directly and
hence the energies of the stationary states of the hydrogen atom. To complete
the picture, Eckart showed how to find the matrix elements associated with a general
operator, for example the elements $Q_j(nk)$ of $Q_j$,

$$ Q_j\, \psi_n = \sum_k Q_j(nk)\, \psi_k . \tag{15.73} $$

He concluded his paper with the remark that: 


‘The method of obtaining the matrices which has just been outlined differs very little 
from Lanczos’s interpretation of the matrix calculus which is thus included in the present 
calculus.’ 


This work, submitted to the Physical Review on 7 June 1926, but only published in October 
1926, completed the formal identity of matrix and wave mechanics and demonstrated the 
power of operator techniques in quantum mechanics. 


15.6 Reconciling quantum mechanics and Bohr’s quantisation of angular momentum - the WKB approximation


An issue which perplexed Bohr and his colleagues was: why was the simple Bohr model of 
the atom so successful when it now appeared that the quantisation of orbits was associated 
with boundary conditions which had to be imposed upon permissible solutions of the 
Schrödinger wave equation at infinity? The answer was soon provided by the independent
analyses of Gregor Wentzel (1926a), Kramers (1926) and Léon Brillouin (1926), who used
what is now known as the WKB approximation.

Except for a few special cases, analytic solutions of Schrödinger’s equation are not
available, but the WKB approximation enables approximate solutions to be found in cases
in which the potential varies slowly with position $x$. By ‘slow’, we mean that the changes
in potential are small on the scale of the de Broglie wavelength. These were the cases
discussed independently by Wentzel, Kramers and Brillouin. In the case in which there
is no spatial variation of the potential function, the result corresponding to the solution
in classical mechanics is found. In the case in which the potential is slowly varying,
the solution to first order in the small quantity $h/2\pi i$ results in Sommerfeld’s quantum
relation


$$ \oint p\, \mathrm{d}x = nh . \tag{15.74} $$


In the later analyses by Kramers and his colleagues (Niessen, 1928; Kramers and Ittmann, 
1929), it was shown that the relation is in fact 


$$ \oint p\, \mathrm{d}x = \left(n + \tfrac{1}{2}\right) h . \tag{15.75} $$
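
The difference between (15.74) and (15.75) is easily seen for the harmonic oscillator, whose exact energy levels are $(n + \tfrac{1}{2})\,h\nu$. The sketch below is an illustration under stated assumptions: units with $h/2\pi = m = \omega = 1$ (so $h = 2\pi$), the potential $U = x^2/2$, a midpoint-rule evaluation of the closed-orbit integral and a bisection search for the energy. The function names, the number of steps and the search interval are hypothetical choices.

```python
import math


def action(energy, potential, x_left, x_right, steps=2000):
    """Closed-orbit integral of p dx = 2 * integral of sqrt(2 (E - U)) dx, with m = 1."""
    # The substitution x = c + r sin(phi) removes the square-root
    # singularity of the integrand at the turning points.
    c, r = 0.5 * (x_left + x_right), 0.5 * (x_right - x_left)
    dphi = math.pi / steps
    total = 0.0
    for j in range(steps):
        phi = -0.5 * math.pi + (j + 0.5) * dphi
        x = c + r * math.sin(phi)
        p = math.sqrt(max(0.0, 2.0 * (energy - potential(x))))
        total += p * r * math.cos(phi) * dphi
    return 2.0 * total


def oscillator_energy(n, half_integer=True):
    """Solve action(E) = (n + 1/2) h, or n h, for E by bisection (hbar = m = omega = 1)."""
    h = 2.0 * math.pi
    target = (n + (0.5 if half_integer else 0.0)) * h

    def mismatch(energy):
        a = math.sqrt(2.0 * energy)  # classical turning points at x = -a and x = +a
        return action(energy, lambda x: 0.5 * x * x, -a, a) - target

    lo, hi = 1e-9, n + 5.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


for n in range(4):
    print(n, oscillator_energy(n), oscillator_energy(n, half_integer=False))
```

With the half-integer rule (15.75) the energies come out at $n + \tfrac{1}{2}$ in these units, which are the exact oscillator levels, whereas the rule (15.74) gives $n$ and so misses the zero-point energy.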


Enrico Persico (1938) gives a simple derivation of the result (15.75) starting from
Schrödinger’s equation, making exactly the same approximations as in the papers by Wentzel
(1926a), Kramers (1926) and Brillouin (1926). First, Persico writes the one-dimensional
Schrödinger wave equation for an oscillator in the usual form


$$ \frac{\mathrm{d}^2 \psi}{\mathrm{d}x^2} + \frac{8\pi^2 m}{h^2}\,(E - U)\,\psi = 0 . \tag{15.76} $$


Fig. 15.1. Fig. 1 (top) illustrates the potential function of a harmonic oscillator with various values of $U$ and $E$ described in the
text. Fig. 2 (bottom) shows a solution of the Schrödinger wave equation for a harmonic oscillator for principal
quantum number $n = 8$. Both diagrams are from Persico (1938).


The potential U and the solution of Schrédinger’s wave equation as a function of x are 
illustrated in Figs. 1 and 2 of Fig. 15.1 which are taken from Persico’s paper. It can be seen 
that the oscillations of the wavefunction are much more rapidly varying than the change 
in the potential U. To a good approximation, in the region AB away from the immediate 
vicinity of the points A and B at which E = U, the angular frequency of oscillation is 


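$$\omega(x) = \sqrt{\frac{8\pi^2 m}{h^2}(E - U)}, \tag{15.77}$$

so that the wave equation (15.76) takes the form

$$\frac{\mathrm{d}^2\psi}{\mathrm{d}x^2} + \omega^2(x)\,\psi = 0. \tag{15.78}$$

If U were indeed independent of x, the solution for ψ(x) would be

$$\psi(x) = C\sin(\omega x + \theta). \tag{15.79}$$
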
A better approximation would be to seek a solution of the form 

$$\psi(x) = F(x)\sin[S(x)], \tag{15.80}$$

where F(x) is a slowly varying function of x. Inserting this trial solution into (15.78), we 
find 

$$(F'' - FS'^2 + \omega^2 F)\sin S + (2F'S' + FS'')\cos S = 0, \tag{15.81}$$


where the dashes mean differentiation with respect to x. To satisfy this equation, the 
expressions in front of both the sine and cosine terms must be zero and so we obtain two 
conditions, 

$$F'' - FS'^2 + \omega^2 F = 0; \qquad 2F'S' + FS'' = 0. \tag{15.82}$$


If F is a slowly varying function of x, |F″| ≪ ω²|F|, and so the first expression in (15.82) 
becomes 

$$\omega^2 = S'^2. \tag{15.83}$$

The solution for S is then 

$$S = \int^{x}\omega\,\mathrm{d}x + \theta, \tag{15.84}$$

where θ is a constant. 
The second expression in (15.82) can be integrated immediately to give 
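
$$F = \frac{C}{\sqrt{S'}} = \frac{C}{\sqrt{\omega}}, \tag{15.85}$$

using the result (15.83). Hence, the wave function in the region of the oscillations can be 
approximated by 

$$\psi = \frac{C}{\sqrt{\omega}}\sin\left(\int^{x}\omega\,\mathrm{d}x + \theta\right). \tag{15.86}$$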
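
The integration of the second condition in (15.82) that leads to (15.85) can be written out in a few lines. This is only a sketch of the intermediate step, using the symbols already defined; the constant of integration is written as C² so that it matches the C of (15.85), and the positive square root is assumed.

```latex
\begin{align*}
2F'S' + FS'' = 0
  &\;\Longrightarrow\; 2FF'S' + F^{2}S'' = 0
     && \text{(multiplying through by } F\text{)} \\
  &\;\Longrightarrow\; \frac{\mathrm{d}}{\mathrm{d}x}\left(F^{2}S'\right) = 0 \\
  &\;\Longrightarrow\; F^{2}S' = C^{2}
   \;\Longrightarrow\; F = \frac{C}{\sqrt{S'}} = \frac{C}{\sqrt{\omega}} ,
\end{align*}
```

The last equality uses S′ = ω from (15.83). The amplitude therefore grows where ω is small, that is, towards the classical turning points.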


Now, it is clear that the solution (15.86) is not valid at the points A and B, at which 
E — U = 0. Dealing with A first, Persico introduces the trick of replacing the parabolic 
function U with a ‘step’, indicated by the dashed lines in the vicinity of A in Fig. 1 of 
Fig. 15.1. In the interval A to A″, the potential takes the constant value U₁ and in the 
interval A to A′, the constant value U₂. The potentials U₁ and U₂ are chosen such that 

$$U_2 - E = E - U_1. \tag{15.87}$$


Therefore, in the interval A to A”, the frequency of oscillation takes the constant value 


$$\omega_1 = \sqrt{\frac{8\pi^2 m}{h^2}(E - U_1)}. \tag{15.88}$$


Inserting this value into (15.86), the wavefunction measured from A is 





$$\psi = \frac{C}{\sqrt{\omega_1}}\sin[\omega_1(x - x_1) + \theta]. \tag{15.89}$$


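Now consider the wavefunction in the interval A′ to A. In this region, U₂ > E and so the 
wave equation becomes 

$$\frac{\mathrm{d}^2\psi}{\mathrm{d}x^2} - \frac{8\pi^2 m}{h^2}(U_2 - E)\,\psi = \frac{\mathrm{d}^2\psi}{\mathrm{d}x^2} - \frac{8\pi^2 m}{h^2}(E - U_1)\,\psi = \frac{\mathrm{d}^2\psi}{\mathrm{d}x^2} - \omega_1^2\,\psi = 0, \tag{15.90}$$

the first equality resulting from our choice of the values of U₁ and U₂ in (15.87). The 
solution of (15.90) is an exponentially decreasing function of decreasing x in the interval 
AA′ with exponent ω₁, 

$$\psi(x) = a\,\mathrm{e}^{\omega_1 x}. \tag{15.91}$$
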
Now we need to join together the solutions (15.89) and (15.91) at A. As usual, for any 
wave, the function and its first derivative must be continuous at A and so 

$$\frac{C}{\sqrt{\omega_1}}\sin\theta = a\,\mathrm{e}^{\omega_1 x_1}; \qquad \frac{C}{\sqrt{\omega_1}}\,\omega_1\cos\theta = a\,\omega_1\,\mathrm{e}^{\omega_1 x_1}. \tag{15.92}$$


Dividing the first by the second equation in (15.92), we find tan θ = 1, θ = π/4. Hence 
the approximate solution for ψ in the interval A to B is 

$$\psi = \frac{C}{\sqrt{\omega}}\sin\left(\int_{x_1}^{x}\omega\,\mathrm{d}x + \frac{\pi}{4}\right). \tag{15.93}$$


This is the WKB solution found by Kramers and his colleagues. Notice the important point 
that the first maximum of the wavefunction occurs at phase π/4 from x₁ in the positive 
x-direction. 

Similar considerations apply to the wavefunction in the vicinity of B, the point x₂ lying 
π/4 beyond the last maximum. Therefore, the total phase difference between A and B, or 
x₁ and x₂, is 

$$\int_{x_1}^{x_2}\omega\,\mathrm{d}x = n\pi + \frac{\pi}{2} = \left(n + \frac{1}{2}\right)\pi, \tag{15.94}$$

where n is found by counting the number of nodes between A and B. 

Now, classically, the momentum of the electron is p = ±√[2m(E − U)] and so is directly 
related to the angular frequency of the wavefunction through (15.77), p = hω/2π. If we 
now take the integral of p dx through one complete oscillation of the harmonic oscillator, 
that is, from x₁ to x₂ and back to x₁, we find 

$$\oint p\,\mathrm{d}x = 2\int_{x_1}^{x_2} p\,\mathrm{d}x = \frac{h}{\pi}\int_{x_1}^{x_2}\omega\,\mathrm{d}x, \tag{15.95}$$

and hence, from (15.94), 

$$\oint p\,\mathrm{d}x = \left(n + \frac{1}{2}\right)h, \tag{15.96}$$

which is precisely the Bohr-Sommerfeld quantisation condition, including the term ½h. 

We recall that there is no mention of the motion of the electron in the atom according 
to either wave or matrix mechanics, a point emphasised in the quotation by Pauli at the 
end of Sect. 15.4: ‘no description of the electron in the atom in space and time is given’. 
It seems best to regard Bohr’s quantisation of angular momentum as a lucky coincidence 
which opened up the route to the old and new quantum physics. 
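
As a numerical cross-check of (15.96), the following minimal sketch evaluates the closed-orbit integral of p dx for a harmonic oscillator. It assumes units in which ħ = m = ω = 1, so that h = 2π, and it takes the energies E = (n + ½)ħω from the exact solution of the oscillator, a standard result that is not derived in this section. The function name `action_integral` and its parameters are illustrative only.

```python
import math


def action_integral(energy, mass=1.0, omega=1.0, steps=2000):
    """Closed-orbit integral of p dx for the potential U = m * omega^2 * x^2 / 2."""
    # Classical turning points lie at x = +/- x_turn, where E = U.
    x_turn = math.sqrt(2.0 * energy / (mass * omega**2))
    # The substitution x = x_turn * sin(t) removes the square-root behaviour
    # at the turning points: p dx = sqrt(2 m E) * x_turn * cos(t)^2 dt.
    dt = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = -math.pi / 2.0 + (i + 0.5) * dt  # midpoint rule
        total += math.cos(t) ** 2 * dt
    one_way = math.sqrt(2.0 * mass * energy) * x_turn * total
    return 2.0 * one_way  # from x1 to x2 and back again


if __name__ == "__main__":
    h = 2.0 * math.pi  # Planck's constant in units with hbar = 1
    for n in range(4):
        print(n, action_integral(n + 0.5) / h)
```

In these units the printed ratio should equal n + ½ to numerical precision, which is the content of (15.96) for this particular potential.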


15.7 Reflections 


By mid-1926, it was becoming clear that the operator calculus was the way ahead for 
the study of quantum problems, stimulated by the remarkable insights of Born, Dirac, 
Eckart, Heisenberg, Jordan, Schrödinger and Wiener, to mention only those whose work 
has been highlighted in this chapter. Born was unstinting in his praise for Schrödinger’s 
achievement. In his obituary of Schrödinger, he referred to his papers of 1926 as ‘of a 
grandeur unsurpassed in theoretical physics’ (Born, 1961b). Planck was equally effusive: 


‘[Schrödinger’s wave equation] plays the same part in modern physics as do the equations 
established by Newton, Lagrange and Hamilton in classical mechanics.’ (Planck, 1931b) 


The next few years saw an extraordinary flowering of quantum mechanics as more and 
more flesh was put on the skeleton of the operator calculus. 
The discovery of the spin of the electron by Uhlenbeck and Goudsmit was a major 
advance in the understanding of physics at the atomic level. Its discovery coincided with the 
development of both matrix and wave mechanics and its incorporation into the scheme of 
quantum mechanics and statistics led to deeper understanding of the underlying structure 
of quantum mechanics. Almost immediately, Heisenberg and Jordan used the new scheme 
of matrix mechanics to derive the expression for the g-factor which Landé had derived 
empirically from a very close study of the anomalous Zeeman effect. An important 
consequence of these developments was that the different approaches of matrix and wave 
mechanics were brought together. In particular, the discovery of spin as a new quantum 
number suggested the possibility of understanding systems containing more than one 
electron. Heisenberg’s analysis of the helium atom was to pave the way for the full incorporation 
of spin into quantum mechanics and quantum statistics. 


16.1 Spin and the Landé g-factor 




The story of the discovery of the spin of the electron by Uhlenbeck and Goudsmit (1925a) 
was told in Sect. 8.5. As discussed in that section, their discovery was based upon empirical 
studies of the regularities observed in the anomalous Zeeman effect, inspired by the intricate 
analyses of Landé. Although based originally upon the classical concept of a rotating 
electron, electron spin is a purely quantum mechanical property intrinsic to the electron. 
Opinions were strongly divided about the validity of the concept, Pauli taking a strongly 
negative position, while Bohr, Heisenberg and Jordan took a more positive view. The 
challenge taken up by Heisenberg was to find a quantum mechanical solution for the 
anomalous Zeeman effect using the concept of a spin-½ particle within the context of their 
recently completed matrix formalism. 

Uhlenbeck and Goudsmit’s innovations were to postulate that the electron has an intrinsic 
spin with spin quantum numbers s = ±½ and that the gyromagnetic ratio associated with 
electron spin should be twice that associated with the electron’s orbital motion. We recall 
that the gyromagnetic ratio for orbital motion of the electron μₑ(orb) is the ratio of the 
magnetic moment associated with that motion to its angular momentum. For electron spin, 
the gyromagnetic ratio μₑ(spin) was postulated to be twice that value. Thus, 

$$\boldsymbol{\mu}_e(\mathrm{orb}) = -\frac{e}{2m_e}\,\boldsymbol{L}; \qquad \boldsymbol{\mu}_e(\mathrm{spin}) = -\frac{e}{m_e}\,\boldsymbol{s}, \tag{16.1}$$

where L and s are the angular momentum vectors associated with the orbital motion and 
electron spin respectively. Then, the Hamiltonian H for an electron in a uniform magnetic 
field B can be written 


$$H = H_0 + H_1 + H_2 + H_3. \tag{16.2}$$

Heisenberg took the various contributions to H to be as follows: 


• The term H₀ is the energy associated with the non-relativistic motion of a spinless electron 
about the nucleus in the absence of an external magnetic field. 
• The second term H₁ represents the interaction energy between the orbital angular 
momentum L and spin s of the electron with the external magnetic field B, 

$$H_1 = \frac{e}{2m_e}\,\boldsymbol{B}\cdot(\boldsymbol{L} + 2\boldsymbol{s}). \tag{16.3}$$

• The term H₂ represents the energy associated with the coupling between the induced 
magnetic field Bᵢ observed in the frame of reference of the moving electron and the 
magnetic moment of the electron, what is termed spin-orbit coupling. For a single 
electron orbiting a nucleus of charge Ze, the electric field strength at radius r from 
the nucleus is 

$$\boldsymbol{E} = \frac{Ze\,\boldsymbol{r}}{4\pi\epsilon_0 r^3}, \tag{16.4}$$

and so the induced, or internal, magnetic flux density observed by the electron in its rest 
frame is 

$$\boldsymbol{B}_i = -\frac{\boldsymbol{v}\times\boldsymbol{E}}{c^2} = -\frac{Ze\,(\boldsymbol{v}\times\boldsymbol{r})}{4\pi\epsilon_0 c^2 r^3} = \frac{Ze\,\boldsymbol{L}}{4\pi\epsilon_0 m_e c^2}\left(\frac{1}{r^3}\right). \tag{16.5}$$

Hence, the spin-orbit coupling is expected to have the form 

$$H_2 = \frac{Ze^2}{4\pi\epsilon_0 m_e^2 c^2}\,\overline{\left(\frac{1}{r^3}\right)}\,(\boldsymbol{L}\cdot\boldsymbol{s}), \tag{16.6}$$

where the overline means the value of r⁻³ averaged over the orbit of the electron. Note 
that more generally, when the effects of shielding by other electrons in the atom are taken 
into account, the electric field will not be of inverse-square form but can be written 

$$E = -\frac{\mathrm{d}V(r)}{\mathrm{d}r}, \tag{16.7}$$

where V(r) is the mean radial electrostatic potential. Associated with the interaction 
energy there is a precession of the axis of the magnetic moment of the electron about the 
internal magnetic field, just as in the case of the classical analysis of the Zeeman effect 
(Sec. 4.1), 

$$\boldsymbol{\Omega}_{ls} = \frac{e}{m_e}\,\boldsymbol{B}_i = \frac{Ze^2}{4\pi\epsilon_0 m_e^2 c^2}\left(\frac{1}{r^3}\right)\boldsymbol{L}. \tag{16.8}$$

As compared with (4.10), however, (16.8) has included the factor of 2 associated with the 
gyromagnetic ratio of the spin of the electron. Notice also that the precession frequency 
is proportional to the interaction energy, Ω_ls ∝ H₂. 
• The final term H₃ represents the relativistic corrections to the Hamiltonian which was 
taken from Sommerfeld’s analysis of the classical relativistic hydrogen atom 
(Sommerfeld, 1916a). 


Despite the less than encouraging views of Pauli, in November 1925 Heisenberg set 
about putting the above Hamiltonian through what he referred to as the ‘matrix mill’ in 
order to find the stationary states and line splittings associated with the anomalous Zeeman 
effect. Disappointingly, he almost reproduced Landé’s formula for the anomalous Zeeman 
effect, but the crucial spin-orbit coupling term resulted in a factor of 2 discrepancy from 
Landé’s expression, a result which cast doubt on the whole scheme. 

In early January 1926, Heisenberg became aware of the new operator formulation of 
quantum mechanics of Born and Wiener (1926) which opened up the route for extending 
matrix mechanics to the more general operator formalism. He was then able to re-evaluate 
the problem using action—angle variables, but nonetheless, the stubborn factor of 2 remained, 
causing general disappointment among the proponents of electron spin. 

The solution was, however, at hand thanks to the insight of Llewellyn Thomas who had 
arrived recently at Bohr’s Institute in Copenhagen as a visiting graduate student. Thomas re- 
examined the relativistic transformations between the frame of the electron and the external 
frame and discovered that the expression (16.6) was not the complete story. According to 
the special theory of relativity, there is an additional kinematic effect associated with the 
orbital motion of a vector, such as the spin vector of the electron. Thomas was aware of 
such a precession, which had been worked out by de Sitter in the context of the relativistic 
precession of the Moon and which is analysed in 
Eddington’s book The Mathematical Theory of Relativity (Eddington, 1924). As Thomas 
expressed it in his short paper to Nature (Thomas, 1926), 


‘[According to the above argument], the precession of the spin axis . . . is its precession in a 
system of coordinates (2) in which the centre of the electron is momentarily at rest. System 
(2) is obtained from system (1), in which the electron is moving and the nucleus at rest, 
by a Lorentz transformation with the velocity v. If the acceleration of the electron is [a], 
and the system (3) is obtained from system (1) by a Lorentz transformation with velocity 
v + a dt, then the precession which an observer at rest with respect to the nucleus would 
observe, and which should be summed to give the secular precession, is that precession 
which would turn the direction of the spin axis at time t in (2) into its direction at time 
t + dt in (3) if both directions were regarded as directions in (1). To a first approximation 
system (3) is obtained from system (2) by a Lorentz transformation with the velocity a dt 
together with a rotation (1/2c²)[v × a] dt.’ 


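This purely kinematic effect results in an additional contribution to the precession, and 
hence interaction energy, of the electron which amounts to 

$$\boldsymbol{\Omega}_T = \frac{1}{2c^2}\,[\boldsymbol{v}\times\boldsymbol{a}], \tag{16.9}$$

and since, to a first approximation, mₑa = eE, it follows that 

$$\boldsymbol{\Omega}_T = -\frac{1}{2}\,\frac{Ze^2}{4\pi\epsilon_0 m_e^2 c^2}\left(\frac{1}{r^3}\right)\boldsymbol{L}. \tag{16.10}$$
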
Thus, the effective internal field is only half the value given by (16.6) and can account 
completely for the discrepant factor of 2. After considerable debate, even Pauli was converted 
and the paper on the quantum mechanical explanation for the anomalous Zeeman effect 
was published by Heisenberg and Jordan in June 1926 (Heisenberg and Jordan, 1926). 
Rechenberg has written in his summary of the history of quanta and quantum mechanics 
that the explanation of the anomalous Zeeman effect was one of the greatest triumphs of 
matrix mechanics (Rechenberg, 1995). 
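
The origin of the factor of 2 can be set out in a few lines. The sketch below assumes the sign convention B_i = −(v × E)/c² for the field seen in the rest frame of the electron and uses mₑa = eE as in the text; Ω_ls is used here simply as a label for the spin-orbit precession rate of (16.8) and Ω_T for the Thomas term of (16.9).

```latex
\begin{align*}
\boldsymbol{\Omega}_{ls} &= \frac{e}{m_e}\,\boldsymbol{B}_i
   = -\frac{e}{m_e c^{2}}\,(\boldsymbol{v}\times\boldsymbol{E}), \\
\boldsymbol{\Omega}_{T} &= \frac{1}{2c^{2}}\,(\boldsymbol{v}\times\boldsymbol{a})
   = \frac{e}{2 m_e c^{2}}\,(\boldsymbol{v}\times\boldsymbol{E})
   = -\tfrac{1}{2}\,\boldsymbol{\Omega}_{ls}, \\
\boldsymbol{\Omega}_{ls} + \boldsymbol{\Omega}_{T} &= \tfrac{1}{2}\,\boldsymbol{\Omega}_{ls}.
\end{align*}
```

On these assumptions the net precession rate, and with it the spin-orbit interaction energy, is half the value obtained from the internal magnetic field alone.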


16.2 Heisenberg and the helium atom 


Following the departure of Kramers to take up the chair of theoretical physics at Utrecht, 
Heisenberg succeeded to the post of university lecturer and assistant to Bohr in Copenhagen 
in May 1926. It was well-known that the old quantum theory had insuperable problems 
in dealing with atoms with more than one electron and Heisenberg was well aware of 
these, having already worked on the theory of the helium atom on the basis of the 
Bohr-Sommerfeld theory of atomic structure. By 1926, however, the problem was ripe for a 
renewed attack. The discoveries of electron spin by Uhlenbeck and Goudsmit (1925a) 
and Pauli’s exclusion principle (1925) as well as the success with which Heisenberg and 
Jordan had accounted for the anomalous Zeeman effect using the operator enhancement 
of matrix mechanics suggested a new approach to the helium problem. At the same time, 
Heisenberg rapidly assimilated Schrödinger’s wave mechanics which provided a simpler 
means of determining the relevant matrix elements. 

The well-known problem associated with the spectra of helium and the alkaline-earth 
metals, such as magnesium and calcium, was that there are two apparently separate and 
independent sets of lines present in their optical spectra. In the case of helium, the 
corresponding separate term diagrams were referred to as belonging to para- and orthohelium 
(Fig. 16.1). Both sets of energy levels were not so different from the terms appearing in the 
hydrogen term diagram. The parahelium lines were all singlets whereas the orthohelium 
series consisted of very narrow triplets — the energy levels of the ortho terms are slightly 
more tightly bound than those of the corresponding para terms. Originally, it was thought 
that helium was in fact a mixture of two gases, para and orthohelium. 

According to Goudsmit’s recollections,¹ he and Bohr discussed the problem of the helium 
spectrum in February 1926 during a visit to Copenhagen, not long after the discovery of 
spin. He recalled that 


‘By looking at the helium spectrum Bohr understood right away that if you turned over 
one of the two spins [of the electrons in the helium atom], the energy suddenly would be 
entirely different.’ 


Goudsmit attempted to explain the large energy difference between the ground states of 
ortho and parahelium in terms of a magnetic interaction between the magnetic moments of 
the two electrons, but could not obtain as large a difference as observed. 
The term diagram for helium (Herzberg, 1944). The term diagram on the left is for parahelium and is characterised by 
the two electrons having opposite spins and so S = 0, corresponding to singlet states. The term diagram on the right 
is for orthohelium in which the electrons have parallel spins so that S = 1 and the states are triplet states. 
Transitions between the singlet and triplet states are forbidden. 


Heisenberg realised that the solution to the problem lay in applying Pauli’s exclusion 
principle to the helium atom (see Sect. 8.4). On a postcard written to Pauli towards the end 
of April 1926, Heisenberg wrote 


‘we have found a rather decisive argument that your exclusion of equivalent orbits [of 
two electrons in an atom] is connected with the singlet-triplet separation. ... Consider 
the energy written as a function of the transition probabilities. Then, a large difference 
results if one... has transition to 1S, or if, according to your ban [exclusion principle], 
one puts them equal to zero. That is, para- and ortho-[helium] do have different energies, 
independently of the interaction between magnets.’ 


This provided the spur to Heisenberg to begin calculating. According to the Pauli exclusion 
principle, two electrons with opposite spins could occupy the ground state and would 
correspond to the parahelium series. On the other hand, if the spins were parallel, one of 
the electrons could remain in the ground state but the other would have to occupy a less 
tightly bound orbit, well above the ground state. Central to the analysis was the fact that 
the two electrons in the helium atom are identical particles and this could not be ignored. 
In the scheme of matrix mechanics, neglecting the weak interaction between the magnetic 
moments of the electrons, Heisenberg adopted a model in which the Hamiltonian matrix H⁰ 
was composed of two terms, Hᵃ and Hᵇ, each term referring to the motion of the electron 
under the influence of the combined Coulomb field of the nucleus and the two electrons. 
The charge experienced by an electron in the unshielded Coulomb field of the nucleus 
would be 2e, or e if the nuclear charge were shielded by the other electron. The key point 
realised by Heisenberg was that the exchange of the a and b electrons would leave the 
energies of the stationary states unaltered. 

Heisenberg proceeded with the calculation of the stationary states of the helium atom 
treating the Coulomb repulsion between the electrons as a first-order perturbation. The 
result was the discovery of two separate symmetric and antisymmetric solutions for the 
stationary states, transitions being allowed within each set of separate energy levels, but 
not between them. Initially, Heisenberg identified the symmetric solutions as the ones to 
be adopted and considered this choice to be consistent with Pauli’s exclusion principle 
and with Bose-Einstein statistics. Much later, in an interview in 1963, he confessed to his 
confusion about the nature of the statistics to be associated with the electrons in atoms in 
1926. He stated: 


‘For a long time, I continued to mix up Bose-Einstein and Fermi—Dirac statistics. I did not 
know Fermi-Dirac statistics at that time; I knew only the Pauli exclusion principle. I was 
always confused between Bose-Einstein statistics and the Pauli exclusion principle which 
produce different ways of counting states. When I wrote the equations for two identical 
electrons, there were two solutions, one symmetrical and the other anti-symmetrical. First I 
thought that I had to take the anti-symmetrical solution to obtain the Bose statistics and that 
must be the one that gave the Pauli principle. Later, I saw that it was the other way around. 
One must take the symmetrical solution to get Bose statistics, and the antisymmetrical 
solution to get Pauli’s exclusion principle.’ (Heisenberg, 1963) 
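
The distinction Heisenberg describes can be illustrated with a toy calculation. The sketch below builds symmetric and antisymmetric two-particle combinations from a pair of one-particle states and checks how each behaves when the particle labels are exchanged. The one-particle states are hypothetical particle-in-a-box modes chosen purely for convenience, not helium wavefunctions, and the function names are illustrative.

```python
import math


def orbital(n, x):
    """Hypothetical one-particle state: mode n of a box on 0 <= x <= 1."""
    return math.sqrt(2.0) * math.sin(n * math.pi * x)


def psi_sym(n, m, x1, x2):
    """Symmetric combination: unchanged when the two particles are exchanged."""
    return (orbital(n, x1) * orbital(m, x2)
            + orbital(m, x1) * orbital(n, x2)) / math.sqrt(2.0)


def psi_anti(n, m, x1, x2):
    """Antisymmetric combination: changes sign when the particles are exchanged."""
    return (orbital(n, x1) * orbital(m, x2)
            - orbital(m, x1) * orbital(n, x2)) / math.sqrt(2.0)


if __name__ == "__main__":
    print(psi_sym(1, 2, 0.3, 0.7), psi_sym(1, 2, 0.7, 0.3))
    print(psi_anti(1, 2, 0.3, 0.7), psi_anti(1, 2, 0.7, 0.3))
    print(psi_anti(2, 2, 0.3, 0.7))  # both particles in the same state
```

The antisymmetric combination vanishes identically when both particles are assigned the same state, which is how the exclusion principle emerges from this choice; the symmetric combination carries no such restriction.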


Heisenberg proceeded to carry out the analysis to find the energy levels of the helium atom 
and, in doing so, made use of Schrödinger’s wave mechanics to evaluate the perturbation 
terms. The Hamiltonian for the helium atom can be written 
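
$$H = \frac{p_1^2}{2m_e} + \frac{p_2^2}{2m_e} - \frac{2e^2}{4\pi\epsilon_0 r_1} - \frac{2e^2}{4\pi\epsilon_0 r_2} + \frac{e^2}{4\pi\epsilon_0 r_{12}}, \tag{16.11}$$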


where r₁ and r₂ are the distances of the electrons from the nucleus and r₁₂ is the distance 
between the electrons. For the unperturbed helium atom, in which the interaction between 
the electrons as represented by the last term in (16.11) is omitted, Schrödinger’s equation is 
separable and wavefunction solutions of the form ψ = φ₁φ₂ can be found, where φ₁ and 
φ₂ are the wavefunctions for electrons 1 and 2. Heisenberg found that, when he carried out 
the perturbation analysis of the two-electron system including the electrostatic repulsion 
between the electrons, the symmetric and antisymmetric solutions were 

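$$\psi_{\mathrm{S}} = \frac{1}{\sqrt{2}}\left(\phi_n^{1}\phi_m^{2} + \phi_m^{1}\phi_n^{2}\right); \qquad \psi_{\mathrm{A}} = \frac{1}{\sqrt{2}}\left(\phi_n^{1}\phi_m^{2} - \phi_m^{1}\phi_n^{2}\right), \tag{16.12}$$

where the superscripts label the electrons and the subscripts the one-electron states. 
Heisenberg ‘used the steamroller’ approach to find the energy levels of the helium atom, 
approximating the radial distribution of the electric potential by the function 

$$V(r) = -\frac{2e^2}{4\pi\epsilon_0 r} + f(r), \quad\text{where}\quad f(r) = \begin{cases} \dfrac{e^2}{4\pi\epsilon_0 r_0} & \text{for } 0 < r < r_0, \\[2ex] \dfrac{e^2}{4\pi\epsilon_0 r} & \text{for } r_0 < r < \infty. \end{cases} \tag{16.13}$$
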
With this approximation, Heisenberg was able to find a reasonable quantitative explanation 
of the energy levels in the para and ortho states of the helium atom. In the second part of 
his paper he included the small additional changes associated with electron spin, which 
included the correct splitting of the singlet and triplet states. Heisenberg’s papers were 
published in August and October 1926 (Heisenberg, 1926a,b). In the second paper, Heisenberg 
acknowledged the value of Schrödinger’s approach to the evaluation of the energy levels 
and transition probabilities. He wrote 


‘I used Schrödinger’s formalism for help with the mathematics. It was clear to me that 
in order to calculate the shift of the levels in the helium atom, matrix elements were 
needed, and that they could be calculated quite well from Schrödinger’s scheme. Such a 
calculation in matrix mechanics would have been difficult.’ 


This major advance in understanding the properties of the helium atom was refined by a 
number of authors, leading ultimately to an accurate estimate of the ionisation potential of 
the helium atom by Kellner (1927). These calculations were also to lead to the development 
of the concept of exchange forces, the quantum mechanical force associated with the fact that 
two identical electrons cannot occupy the same quantum state. This would ultimately lead to 
Heisenberg’s application of the concept of exchange forces to the theory of ferromagnetism 
(Heisenberg, 1928), but first we need to get to grips with the statistical properties of spin-½ 
particles, Fermi-Dirac statistics. 


16.3 Fermi-Dirac statistics – the Fermi approach 


The young Enrico Fermi visited Göttingen and Leiden as a research fellow in 1923-1925 and 
became familiar with the issues of quantum physics. In 1925 he was appointed a lecturer in 
physics at the University of Florence and then to a new professorship of theoretical physics at 
the University of Rome in 1926. His early interest in quantum physics had been in the 
equation of state of an ideal gas, in particular, Nernst’s heat theorem according to which the heat 
capacities of all substances tend to zero as the temperature tends to zero, contrary to the 
expectations of classical statistical mechanics. The papers by Bose and Einstein appeared in 
1924 and 1925 (see Sect. 9.2) and Fermi certainly knew about them (Bose, 1924; Einstein, 
1924, 1925). They had treated light quanta and the atoms of an ideal gas as indistinguishable 
particles and argued that the procedure of allocating these entities to the available states in 
phase space should avoid duplications, provided the particles were indistinguishable. 
Fermi was even more strongly impressed by Pauli’s recent enunciation of the exclusion 
principle for the electrons in the atom (Pauli, 1925), namely, that only one electron can 
occupy a particular quantum state with a given set of atomic quantum numbers (Sect. 8.4). 
In the standard formulation of statistical mechanics, the available states within an enclosure 
are enumerated by counting the available states in momentum space, which is equivalent 
to fitting a finite, but very large, number of wavelengths within its reflecting walls — the 
result is an essentially continuous distribution of energies and momenta of the particles. 
Fermi now proposed reducing the size of the enclosure so that it included only a single atom 
and then applying Pauli’s exclusion principle to that volume. But unlike Pauli, who applied 
his rule to the electrons in the atom, Fermi proposed applying the same type of exclusion 
principle to the atoms of a perfect gas. Fermi wrote, 


‘... one obtains a degeneracy [of the gas] of the expected order of magnitude, if one 
chooses the vessel to be so small as to contain, on average, only one molecule. ... We 
shall demonstrate . . . that the application of the Pauli rule allows us to present a complete, 
consistent theory of the degeneracy of ideal gases.’ (Fermi, 1926b) 


Fermi’s first publication on his new quantum statistics appeared in Italian in the Rendiconti 
del Reale Accademia Lincei on 7 February 1926 and then in German in the Zeitschrift für 
Physik on 11 May 1926 (Fermi, 1926a,b). Fermi’s model consisted of N gas molecules of 
mass m located within a central ‘elastic’ potential U given by the expression 


$$U = 2\pi^2\nu^2 m r^2, \tag{16.14}$$

where ν is the frequency of oscillation of the molecules about the centre r = 0. Considering 
only the translational motion of the atoms or molecules in the x-, y- and z-directions, the 
energies in the three independent coordinates are quantised so that the energy of each 
molecule is 


$$w = h\nu\,(s_1 + s_2 + s_3) = s\,h\nu, \tag{16.15}$$


where s₁, s₂ and s₃ take integral values 0 ≤ sᵢ ≤ s. It is straightforward to show that the 
number of different ways Q_s of obtaining s for this range of s₁, s₂ and s₃ is 

$$Q_s = \frac{(s + 1)(s + 2)}{2}. \tag{16.16}$$

At absolute zero temperature, all the states would be occupied, with one molecule with zero 
energy, three with energy hν, six with energy 2hν and so on. 

At temperatures greater than absolute zero, the atoms or molecules are distributed among 
the available states and Einstein had shown how this calculation is carried out in his paper 
on the Bose-Einstein distribution (Einstein, 1924, 1925). If there are N molecules and the 
total energy E is to be distributed among them, then 

$$\sum_s N_s = N, \quad\text{and}\quad h\nu\sum_s s\,N_s = E, \tag{16.17}$$
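
The counting behind (16.16) and the zero-temperature filling of the states can be checked by brute force. The following is a minimal sketch rather than anything taken from Fermi's paper: `count_states` enumerates the triples (s₁, s₂, s₃) directly, and `ground_state_energy` is a hypothetical helper that places at most one molecule in each state, starting from the lowest level, and returns the total energy in units of hν.

```python
def count_states(s):
    """Number of triples (s1, s2, s3) of non-negative integers with sum s."""
    # Once s1 and s2 are chosen with s1 + s2 <= s, the value of s3 is fixed.
    return sum(1 for s1 in range(s + 1) for s2 in range(s - s1 + 1))


def ground_state_energy(n_molecules):
    """Total energy in units of h*nu with at most one molecule per state."""
    energy = 0
    remaining = n_molecules
    s = 0
    while remaining > 0:
        placed = min(remaining, count_states(s))  # fill level s as far as possible
        energy += placed * s
        remaining -= placed
        s += 1
    return energy


if __name__ == "__main__":
    print([count_states(s) for s in range(5)])
    print(ground_state_energy(10))
```

The first few counts reproduce the sequence one, three, six quoted in the text, and ten molecules exactly fill the levels s = 0, 1 and 2.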


with the requirement that N_s ≤ Q_s. This becomes an exercise in the calculus of variations, 
exactly paralleling Einstein’s analysis. The number of possible ways of distributing N_s 
atoms among Q_s possible locations is given by the usual combinatorial formula $\binom{Q_s}{N_s}$ and 
so the total number of possible ways of distributing the particles is the product of these 
factors for all the energy states, subject to the constraints (16.17), 

$$P = \binom{Q_0}{N_0}\binom{Q_1}{N_1}\binom{Q_2}{N_2}\cdots. \tag{16.18}$$

Using Stirling’s formula for large factorials, N! ≈ N^N, and taking logarithms, the problem 
reduces to finding the most likely value of 

$$\ln P = \sum_s \ln\binom{Q_s}{N_s} = -\sum_s\left(N_s\ln\frac{N_s}{Q_s - N_s} + Q_s\ln\frac{Q_s - N_s}{Q_s}\right), \tag{16.19}$$

subject to the constraints (16.17). Using the standard procedure of the method of 
undetermined multipliers, the result is 


$$\frac{N_s}{Q_s - N_s} = \alpha\exp(-\beta s) \quad\text{or}\quad N_s = \frac{Q_s}{\exp(A + \beta s) + 1}, \tag{16.20}$$

where α, β and A are constants. This result can be compared with Einstein’s expression 
(9.7), 


$$n_k = \frac{g_k}{\exp(\alpha + \beta\varepsilon_k) - 1}, \tag{16.21}$$


showing the characteristic difference between Fermi-Dirac and Bose-Einstein statistics, 
the plus sign in the denominator of the former and the minus sign of the latter. 
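
The role of the undetermined multipliers can be checked numerically. In the sketch below, which assumes the Stirling form of ln P and treats N_s as a continuous variable, the occupation number with the plus sign in the denominator is inserted into one term of ln P and the slope with respect to N_s is estimated by a finite difference. At the most likely distribution this slope should equal A + βs. The function names and the numerical values of Q_s, A and β are illustrative assumptions.

```python
import math


def fermi_occupation(q_s, s, a, beta):
    """Occupation with the plus sign: Q_s / (exp(A + beta * s) + 1)."""
    return q_s / (math.exp(a + beta * s) + 1.0)


def bose_occupation(g_k, s, a, beta):
    """Analogue with the minus sign; only meaningful for A + beta * s > 0."""
    return g_k / (math.exp(a + beta * s) - 1.0)


def log_weight(n_s, q_s):
    """Stirling form of ln C(Q_s, N_s), one term of the sum for ln P."""
    return -(n_s * math.log(n_s / (q_s - n_s))
             + q_s * math.log((q_s - n_s) / q_s))


def slope_of_log_weight(n_s, q_s, step=1e-4):
    """Central-difference estimate of d(ln P_s)/dN_s."""
    return (log_weight(n_s + step, q_s)
            - log_weight(n_s - step, q_s)) / (2.0 * step)


if __name__ == "__main__":
    q_s, s, a, beta = 1000.0, 5, -1.0, 0.3
    n_s = fermi_occupation(q_s, s, a, beta)
    print(n_s, slope_of_log_weight(n_s, q_s), a + beta * s)
```

With the plus sign the occupation can never exceed Q_s, whatever the values of A and β, whereas the expression with the minus sign grows without limit as A + βs approaches zero.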

Although the constants A and β could be determined from the statistical formula 
S = k ln P and the thermodynamic relation T = dE/dS, Fermi was not certain about the 
applicability of the thermodynamic laws at extremely low temperatures and so instead 
derived the constants from the asymptotic value of the particle distribution at very large 
distances from the origin. At these distances, the density of particles becomes very low and 
the energy distribution tends to a Maxwellian velocity distribution. In this way, he found 
that the constant β = hν/kT and determined the value of A. Of particular significance was 
his determination of the pressure of the gas in the extreme low-temperature limit, at which 
the pressure is independent of temperature, 


$$p = \frac{1}{20}\left(\frac{6}{\pi}\right)^{2/3}\frac{h^2 n^{5/3}}{m}, \tag{16.22}$$





the form of which can be derived from elementary physical arguments.² In addition, there 
is a zero point energy per particle which amounts to 

$$E = \frac{3}{40}\left(\frac{6}{\pi}\right)^{2/3}\frac{h^2 n^{2/3}}{m}. \tag{16.23}$$

Finally, the specific heat capacity at low temperatures is 

$$C_V = \frac{2^{4/3}\pi^{8/3}}{3^{2/3}}\,\frac{m k^2 T}{h^2 n^{2/3}}. \tag{16.24}$$

Thus, C_V becomes zero as the temperature tends to zero, as required by Nernst’s heat 
theorem. 
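
The ‘elementary physical arguments’ behind the low-temperature pressure and zero point energy can be sketched numerically. The code below assumes one particle per phase-space cell of volume h³, with no spin degeneracy, so that the particles fill a sphere in momentum space up to a maximum momentum fixed by the number density n. The forms assumed for (16.22) and (16.23) are written out explicitly in the code, and the numerical values of n and m in the usage example are arbitrary illustrations.

```python
import math

H = 6.62607015e-34  # Planck constant in J s (exact SI value)


def fermi_energy(n, m):
    """Kinetic energy at the surface of the filled momentum sphere."""
    # n = (4 * pi / 3) * p_max^3 / h^3 for one particle per cell of volume h^3.
    p_max = H * (3.0 * n / (4.0 * math.pi)) ** (1.0 / 3.0)
    return p_max**2 / (2.0 * m)


def zero_point_energy(n, m):
    """Mean energy per particle in the form assumed here for (16.23)."""
    return (3.0 / 40.0) * (6.0 / math.pi) ** (2.0 / 3.0) * H**2 * n ** (2.0 / 3.0) / m


def degeneracy_pressure(n, m):
    """Low-temperature pressure in the form assumed here for (16.22)."""
    return (1.0 / 20.0) * (6.0 / math.pi) ** (2.0 / 3.0) * H**2 * n ** (5.0 / 3.0) / m


if __name__ == "__main__":
    n, m = 1.0e28, 6.6e-27  # illustrative number density (m^-3) and mass (kg)
    print(zero_point_energy(n, m) / fermi_energy(n, m))
    print(degeneracy_pressure(n, m) / (n * zero_point_energy(n, m)))
```

On these assumptions the mean energy per particle is three-fifths of the energy at the surface of the sphere, and the pressure is two-thirds of the energy density, as expected for a non-relativistic gas.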


16.4 Fermi-Dirac statistics — the Dirac approach 


Meanwhile, back in Cambridge, Dirac continued his personal, rather lonely, agenda for the 
development of quantum mechanics, but profiting from correspondence with his colleagues 
in Göttingen and Copenhagen, particularly with Heisenberg. The latter recognised that 
Dirac’s quantum algebra was a more powerful approach than matrix mechanics. He drew 
Dirac’s attention to Schrödinger’s papers and sought his advice and assistance in reconciling 
the very different approaches of matrix and wave mechanics. Heisenberg was well aware 
of Dirac’s remarkable mathematical skills and he was not disappointed by the outcome. 
Dirac’s q-number formulation of the rules of quantum mechanics had the advantage that 
he did not need to state explicitly what the q-number was, but could use the formalism 
for whatever mathematical object best suited his purposes for the problem at hand. As he 
stated in the introduction to his paper On the theory of quantum mechanics (Dirac, 1926f), 


‘One can build up a theory [of atomic systems] without knowing anything about the 
dynamical variables except the algebraic laws that they are subject to, and can show that 
they may be represented by matrices whenever a set of uniformising variables for the 
dynamical system exists. . . . It can be shown however... that there is no set of 
uniformising variables for a system containing more than one electron, so that the theory cannot 
progress very far on these lines.’ 


In his letters to Dirac, Heisenberg described the reconciliation of matrix and wave 
mechanics and Dirac fully appreciated the advantages of Schrödinger’s theory. 
Characteristically, he set about a systematic exploration of wave mechanics, recasting it in his own 
language.³ The q-numbers of his theory could be taken to be differential operators, the 
pairs of canonically conjugate position and momentum variables q_r and p_r and time and 
energy variables t and W being defined by 

$$q_r, \quad p_r = -\mathrm{i}\,\frac{h}{2\pi}\frac{\partial}{\partial q_r}, \tag{16.25}$$

$$t, \quad -W = -\mathrm{i}\,\frac{h}{2\pi}\frac{\partial}{\partial t}. \tag{16.26}$$


Furthermore, he showed that with these definitions, the classical Hamilton-Jacobi equation 
reduces to Schrödinger’s wave equation and so the whole apparatus of wave mechanics fol- 
lowed naturally from his scheme of quantum mechanics. The relation between Schrödinger’s 
eigenfunctions and the elements of the matrices was also clarified: 


‘... any constant of integration of the dynamical system ... can be represented by a
matrix whose elements are constants, there being one row and one column of the matrix
corresponding to each eigenfunction $\psi_n$.’


Dirac’s clear and concise exposition of the fundamentals of quantum mechanics provided a 
complete demonstration of the equivalence and merits of the different approaches of matrix 
and wave mechanics. He had no hesitation in adapting Schrödinger’s approach to his way
of thinking. 

Dirac had a long-standing interest in developing the quantum mechanics of multiple-
electron systems and in Sect. 3 of his paper concentrated upon the case of the helium
atom and then the statistics of identical particles (Dirac, 1926f). The nub of his thinking is 
contained in the following quotation from his paper. 


‘Consider now a system that contains two or more similar particles, say, for definiteness,
an atom with two electrons. Denote by (mn) that state of the atom in which one electron
is in an orbit labelled m, and the other in the orbit n. The question arises whether the two
states (mn) and (nm), which are physically indistinguishable as they differ only by the
interchange of the two electrons, are to be counted as two different states or as only one?
If the first alternative is right, then the theory would enable one to calculate the intensities
due to the two transitions (mn) → (m′n′) and (mn) → (n′m′) separately, as the amplitude
corresponding to either would be given by a definite element in the matrix representing 
the total polarisation. The two transitions are, however, physically indistinguishable, and 
only the sum of the intensities for the two together could be determined experimentally. 
Hence, in order to keep the essential characteristic of the theory that it shall enable one to 
calculate only observable quantities, one must adopt the second alternative that (mn) and 
(nm) count as one state.’ 


This paragraph encapsulates the key concept in his development of what was to become 
Fermi-Dirac statistics. 

This requirement of the indistinguishability of particles led to inconsistencies in a pure 
matrix mechanical scheme, but it had a straightforward interpretation in terms of the 
eigenfunction representation of the stationary states of the helium atom. Just as in the case 
of Heisenberg’s analysis of the helium atom, if the interaction term between the electrons is 
ignored, the Hamiltonian (16.11) is separable so that the wavefunction of the two electrons 
can be written 


$$\psi_m(x_1, y_1, z_1, t)\,\psi_n(x_2, y_2, z_2, t) = \psi_m(1)\,\psi_n(2). \tag{16.27}$$


The eigenfunction $\psi_m(2)\psi_n(1)$, however, corresponds to exactly the same state.
There must therefore be only one row and column of the matrices corresponding to both
(mn) and (nm) and this is achieved by writing the wavefunction $\psi_{mn}$ in the form


$$\psi_{mn} = a_{mn}\,\psi_m(1)\,\psi_n(2) + b_{mn}\,\psi_m(2)\,\psi_n(1), \tag{16.28}$$


where the $a_{mn}$s and $b_{mn}$s are constants, the set containing only one $\psi_{mn}$ corresponding to
both (mn) and (nm). Furthermore, the $a_{mn}$s and $b_{mn}$s must be chosen so that the matrix
can represent any symmetric function $A$ of the two electrons. Thus, it must be possible to
expand $A\psi_{mn}$ in terms of the complete set of eigenfunctions so that


$$A\psi_{mn} = \sum_{m'n'} \psi_{m'n'}\,A_{m'n',mn}, \tag{16.29}$$

where the $A_{m'n',mn}$ are constants or only functions of time.

There are then two ways of choosing the set of functions $\psi_{mn}$ to satisfy these conditions.
Either $a_{mn} = b_{mn}$, resulting in $\psi_{mn}$ being a symmetrical function of the two electrons and so
all the eigenfunctions are symmetrical, or $a_{mn} = -b_{mn}$, so that $\psi_{mn}$ is an antisymmetrical
function of the two electrons and all the eigenfunctions are antisymmetrical. Dirac concludes


‘An antisymmetrical eigenfunction vanishes identically when two of the electrons are in 
the same orbit. This means that in the solution of the problem with antisymmetric eigen- 
functions there can be no stationary states with two or more electrons in the same orbit, 
which is just Pauli’s exclusion principle. The solution with symmetrical eigenfunctions, 
on the other hand, allows any number of electrons to be in the same orbit, so that this 
solution cannot be the correct one for the problem of electrons in an atom.’ 
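

Dirac’s conclusion can be illustrated numerically. The following Python sketch is only a toy under stated assumptions: the one-particle eigenfunctions `orbital(k, x)` are hypothetical particle-in-a-box modes chosen for convenience (they are not the helium orbitals discussed in the text), and normalisation constants are omitted. It forms the symmetrical and antisymmetrical combinations of (16.28) and evaluates them when both particles occupy the same orbit.

```python
import math


def orbital(k, x):
    """Hypothetical one-particle eigenfunction: k-th mode of a box 0 <= x <= 1."""
    return math.sqrt(2.0) * math.sin(k * math.pi * x)


def psi_mn(m, n, x1, x2, sign):
    """psi_m(1) psi_n(2) + sign * psi_m(2) psi_n(1), the form (16.28).

    sign = +1 is the symmetrical choice a_mn = b_mn,
    sign = -1 the antisymmetrical choice a_mn = -b_mn (normalisation omitted).
    """
    return orbital(m, x1) * orbital(n, x2) + sign * orbital(m, x2) * orbital(n, x1)


# Arbitrary (assumed) coordinates for the two particles.
x1, x2 = 0.2, 0.7

# Interchanging the particles multiplies the function by `sign`.
sym, sym_swapped = psi_mn(1, 2, x1, x2, +1), psi_mn(1, 2, x2, x1, +1)
anti, anti_swapped = psi_mn(1, 2, x1, x2, -1), psi_mn(1, 2, x2, x1, -1)

# Both particles in the same orbit (m = n).
same_orbit_anti = psi_mn(2, 2, x1, x2, -1)
same_orbit_sym = psi_mn(2, 2, x1, x2, +1)

print("symmetrical:     ", sym, sym_swapped)
print("antisymmetrical: ", anti, anti_swapped)
print("same orbit, antisymmetrical:", same_orbit_anti)
print("same orbit, symmetrical:    ", same_orbit_sym)
```

The antisymmetrical combination is zero for m = n whatever coordinates are chosen, which is the exclusion principle in this toy setting; the symmetrical combination is not.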


In the next section of this very impressive paper, Dirac makes clear the application of 
the symmetrical and antisymmetrical wavefunctions to light and particles.


‘The solution with symmetrical eigenfunctions must be the correct one when applied
to light quanta, since it is known that the Einstein—Bose statistical mechanics leads to 
Planck’s law of black-body radiation. The solution with antisymmetrical eigenfunctions, 
though, is probably the correct one for gas molecules, since it is known to be the correct 
one for electrons in an atom, and one would expect molecules to resemble electrons more 
closely than light-quanta.’ 


Dirac proceeds to determine the equation of state of the gas, under the assumption that
there can only be one or zero electrons associated with a given wave, or eigenfunction. The
waves are divided into a number of sets $A_s$, each of the same energy $E_s$. Then, the number
of different ways in which $N_s$ molecules can be distributed among the $A_s$ waves is given
by the standard relation


$$\frac{A_s!}{N_s!\,(A_s - N_s)!}, \tag{16.30}$$

the factorials in the denominator removing duplicates of identical occupancy or vacancies.


The total probability is then found by multiplying the probabilities of all possible states,
subject to the constraints $N = \sum_s N_s$ and $E = \sum_s E_s N_s$. Dirac writes

$$W = \prod_s \frac{A_s!}{N_s!\,(A_s - N_s)!}. \tag{16.31}$$

This has reduced the problem to the same standard calculation in the calculus of variations
which was carried out by Fermi (1926a,b) and discussed in Sect. 16.3. The result is the
standard Fermi–Dirac distribution, which in Dirac’s notation is


$$N_s = \frac{A_s}{e^{\alpha + E_s/kT} + 1}, \tag{16.32}$$


exactly the same as Fermi’s expression (16.20). The expressions for the total numbers of 
particles and their pressure in volume V follow by the standard procedure for counting the 
numbers of states in phase space, 








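$$N = \sum_s N_s = \frac{2\pi V (2m)^{3/2}}{h^3} \int_0^\infty \frac{E_s^{1/2}\,dE_s}{e^{\alpha + E_s/kT} + 1}, \tag{16.33}$$

$$E = \sum_s E_s N_s = \frac{2\pi V (2m)^{3/2}}{h^3} \int_0^\infty \frac{E_s^{3/2}\,dE_s}{e^{\alpha + E_s/kT} + 1}. \tag{16.34}$$


Dirac noted that, since $pV = \tfrac{2}{3}E$, by eliminating $\alpha$ between (16.33) and (16.34), the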
equation of state of the gas could be found. He noted also that the specific heat of the 
gas tended steadily to zero at zero temperature, as required by Nernst’s heat theorem. 
The phenomenon of Bose-Einstein condensation does not occur. 
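

The counting argument behind (16.30) to (16.32) can be checked with a short Python sketch. It is illustrative only: the numbers of waves, the energies (in units of $kT$) and the value of $\alpha$ are assumed values, and `log_ways` evaluates $\ln W$ of (16.31) with the log-gamma function so that non-integer occupations can be compared. The sketch confirms that the binomial factor agrees with a brute-force count, and that displacing the Fermi–Dirac occupations while keeping $N$ and $E$ fixed lowers $W$.

```python
import math
from itertools import combinations


def count_by_enumeration(a_s, n_s):
    """Count the ways of putting n_s identical particles into a_s waves, at most one per wave."""
    return sum(1 for _ in combinations(range(a_s), n_s))


def fermi_dirac_occupation(a_s, e_s, alpha, kT):
    """Occupation N_s = A_s / (exp(alpha + E_s/kT) + 1), as in (16.32)."""
    return a_s / (math.exp(alpha + e_s / kT) + 1.0)


def log_ways(occupations, a_s):
    """ln W from (16.31), using lgamma so that non-integer N_s can be used."""
    return sum(
        math.lgamma(a_s + 1) - math.lgamma(n + 1) - math.lgamma(a_s - n + 1)
        for n in occupations
    )


# The binomial factor (16.30) agrees with a brute-force count for a small set.
brute = count_by_enumeration(6, 2)
formula = math.comb(6, 2)

# Assumed illustrative numbers: three sets of A_s = 10**6 waves, energies in units of kT.
A_S, KT, ALPHA = 10**6, 1.0, 0.0
ENERGIES = [0.0, 1.0, 2.0]
fd = [fermi_dirac_occupation(A_S, e, ALPHA, KT) for e in ENERGIES]

# A displacement that leaves N and E unchanged: sum(d_s) = 0 and sum(E_s * d_s) = 0.
direction = [-1.0, 2.0, -1.0]
shifted_up = [n + 1000.0 * d for n, d in zip(fd, direction)]
shifted_down = [n - 1000.0 * d for n, d in zip(fd, direction)]

print("ways for A_s = 6, N_s = 2:", brute, formula)
print("ln W(FD) - ln W(shifted up):  ", log_ways(fd, A_S) - log_ways(shifted_up, A_S))
print("ln W(FD) - ln W(shifted down):", log_ways(fd, A_S) - log_ways(shifted_down, A_S))
```

Both differences are positive, as expected if the occupations (16.32) maximise $W$ subject to the two constraints.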

Much later in an interview which took place in 1963, Dirac recalled that he had seen 
Fermi’s paper, but had not paid much attention to it. Only when Fermi wrote to him about
it did Dirac recall the paper and fully acknowledge the fact that Fermi had independently
discovered what is now referred to as the Fermi-Dirac distribution. While there is no 
doubt about the priority of Fermi’s pioneering paper, Dirac’s had the greater impact since it 
addressed the issue of quantum statistics in greater generality and was quickly assimilated 
into the structure of quantum mechanics.


16.5 Building spin into quantum mechanics - Pauli spin matrices


Despite Heisenberg and Jordan’s success in accounting for the anomalous Zeeman effect 
(Sect. 16.1), the analysis was only a quantum mechanical extension of the classical expres- 
sion for a spinning electron in a magnetic field. The spin was represented by a spin vector 
s which was added vectorially to the orbital angular momentum L. Pauli fully appreciated 
that spin had not been properly incorporated into the scheme of quantum mechanics for 
a simple reason. Whilst called an ‘angular momentum’, spin could not be represented in 
the same way as orbital angular momentum because the orbital angular momentum has three
independent directions, corresponding to rotation about the x-, y- and z-axes, $M_x$, $M_y$ and
$M_z$, whereas the postulates of Uhlenbeck and Goudsmit required only two spin states associated
with $s = +\tfrac{1}{2}$ and $s = -\tfrac{1}{2}$. The full significance of this difficulty was appreciated by
Darwin (1927a) who wrote, 


‘When what is required is to double the number of states of the electron, it is at the least 
generous to introduce three extra degrees of freedom and then make an arbitrary (though 
not unnatural) assumption which cuts down the triple infinity to two. The electron is in 
fact given a complete outfit of Eulerian angles, even if it may not be necessary so to 
express the matter explicitly. Now we regard the electron as the most primitive thing in 
Nature, and it would therefore be much more satisfactory if the duality could be obtained 
without such great elaboration.’ 


Darwin elaborated his proposal to treat what Pauli called the electron’s ‘classically not- 
describable two-valuedness’ as a vector quantity which involved two separate Schrödinger- 
type equations to describe the different polarisations (Darwin, 1927b,c). He doubled the 
number of equations so that the four new variables could be treated as the components of 
a four-vector in relativity. The scheme was complex and incomplete and was overtaken by 
Pauli’s invention of spin matrices which could be incorporated into quantum mechanics, 
but this involved a quite different approach to the problem. The essence of what Pauli 
(1927b) did is pleasantly summarised in the exposition by Lindsay and Margenau (1957) 
who described the problem as follows: 


‘. . . the operator corresponding to this new degree of freedom, which we shall metaphor- 
ically continue to call ‘spin’, must have only two eigenvalues corresponding to the values 
$\pm h/4\pi$ of angular momentum in any one direction in which the external field is imagined
to be placed.... To make a long story short, it is found very difficult to obtain a differ- 
ential operator... which, acting upon a function of a continuous variable, has only two 
eigenvalues.’ 


Pauli’s approach was quite different from that previously used in quantum mechanics — 
he chose a spin variable which has finite values at only two points. The variable $s$ measures
the spin in, say, the z-direction. The associated spin state function is $\phi(s)$. Now we suppose
the range of $s$ is only the two points $+1$ and $-1$. It is convenient for visualisation purposes to
think of $s$ as the cosine of the angle between the electron’s spin axis and some direction in
space, so that $s = +1$ corresponds to alignment parallel to the chosen direction and $s = -1$
to the antiparallel alignment. The state function $\phi(s)$ can only have finite values $a$ at $s = +1$
and $b$ at $s = -1$ and so the most general function $\phi(s)$ is


$$\phi(s) = a\,\delta_{s,+1} + b\,\delta_{s,-1}, \tag{16.35}$$


where the $\delta$s are delta functions. The state function has to be normalised to unity and so


$$\int \phi^*(s)\,\phi(s)\,ds = \int \left(a^*a\,\delta_{s,+1}\delta_{s,+1} + a^*b\,\delta_{s,+1}\delta_{s,-1} + b^*a\,\delta_{s,-1}\delta_{s,+1} + b^*b\,\delta_{s,-1}\delta_{s,-1}\right) ds \tag{16.36}$$

$$= a^*a + b^*b = 1. \tag{16.37}$$

Interpreting $\phi^*(s)\phi(s)$ as the probability that the system is found in the state $s$, we can
find the state function, even without knowing the operator. If the eigenfunction describing
the system as being certainly in state $s = +1$ is $\psi_+(s)$, then $\psi_+^*(s)\,\psi_+(s) = \delta_{s,+1}$. Now
substituting this result into (16.35), it follows that $a = 1$, $b = 0$ and so


$$\psi_+(s) = \delta_{s,+1}. \tag{16.38}$$

Similarly, if the system is certainly in the state $s = -1$, the eigenfunction is

$$\psi_-(s) = \delta_{s,-1}. \tag{16.39}$$


For the purposes of the spin properties of the electron, (16.38) and (16.39) form a complete 
orthonormal set. 

We can now determine directly the spin operators $\sigma_x$, $\sigma_y$, $\sigma_z$. Suppose we wish to determine
the spin angular momentum about the z-axis $\sigma_z$. Then, by hypothesis, the eigenvalues
are $\pm h/4\pi$ while the eigenfunctions are given by (16.38) and (16.39). If we write for
economy


$$\sigma_{x,y,z} = \frac{h}{4\pi}\,S_{x,y,z}, \tag{16.40}$$


$S_{x,y,z}$ has eigenvalues $\pm 1$. Then, the operator $S_z$ must satisfy the equations


$$S_z\psi_+ = \psi_+, \qquad S_z\psi_- = -\psi_-. \tag{16.41}$$


This is a key point in the argument. $S_z$ cannot be a differential operator because the functions
$\psi$ are not differentiable. Rather (16.41) defines the operator $S_z$. When it operates on $\psi_+$, it
leaves it unchanged; when it operates on $\psi_-$, it changes the sign of $\psi_-$.

It is convenient to regard $S_z$ as a matrix. Then, $\psi_+$ and $\psi_-$ can be written as a column
vector $\psi$,


$$\psi = \begin{pmatrix} \psi_+ \\ \psi_- \end{pmatrix}, \tag{16.42}$$


so that we can write


$$S\psi = S_z\psi \quad \text{with} \quad S_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{16.43}$$


To find the spin operators corresponding to $S_x$ and $S_y$, we assume that the operator
equivalents of the matrix equations for the relations between the components of the angular
momentum (12.83) are applicable and so, for example,


$$\sigma_y\sigma_x - \sigma_x\sigma_y = \frac{h}{2\pi i}\,\sigma_z. \tag{16.44}$$


Using the notation (16.40), the relations between the spin operators become


$$\begin{aligned} S_x S_y - S_y S_x &= 2i\,S_z, \\ S_y S_z - S_z S_y &= 2i\,S_x, \\ S_z S_x - S_x S_z &= 2i\,S_y. \end{aligned} \tag{16.45}$$


Given (16.43), these equations can be readily solved to find what are referred to as the Pauli
spin matrices,

$$S_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}; \qquad S_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}; \qquad S_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{16.46}$$


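Notice a number of important features of these matrices. Only $S_z$ is diagonal and so
determines the eigenvalues of the spin operator $S_z$. The others are not diagonal, as must be
the case since diagonal matrices commute and (16.45) shows that the spin matrices do not
commute. Notice that the spin matrices are simply a convenient shorthand for the following
operator equations:


$$\left.\begin{aligned} S_z\psi_+ &= \psi_+ \\ S_z\psi_- &= -\psi_- \end{aligned}\right\} \qquad \left.\begin{aligned} S_y\psi_+ &= i\psi_- \\ S_y\psi_- &= -i\psi_+ \end{aligned}\right\} \qquad \left.\begin{aligned} S_x\psi_+ &= \psi_- \\ S_x\psi_- &= \psi_+ \end{aligned}\right\} \tag{16.47}$$


It is apparent that $\psi_+$ and $\psi_-$ are eigenstates only for the operator $S_z$, while the operators
$S_x$ and $S_y$ represent mixed states of the $\psi$s.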
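

The algebra of the spin matrices is easy to verify by direct multiplication. The following Python sketch uses only nested lists and the standard matrices $S_x$, $S_y$, $S_z$; the helper names (`matmul`, `commutator`, `close`) are illustrative and not part of any library. It checks that the commutators of each cyclic pair equal $2i$ times the third matrix, and that the matrices act on the column vectors for $\psi_+$ and $\psi_-$ in the way the operator equations describe.

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def commutator(a, b):
    """ab - ba for square matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]


def scaled(c, a):
    """Multiply every element of the matrix a by the number c."""
    return [[c * x for x in row] for row in a]


def close(a, b, tol=1e-12):
    """True if two matrices agree element by element."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


S_X = [[0, 1], [1, 0]]
S_Y = [[0, -1j], [1j, 0]]
S_Z = [[1, 0], [0, -1]]

# psi_+ and psi_- written as column vectors (2 x 1 matrices).
PSI_PLUS = [[1], [0]]
PSI_MINUS = [[0], [1]]

cyclic_ok = (
    close(commutator(S_X, S_Y), scaled(2j, S_Z))
    and close(commutator(S_Y, S_Z), scaled(2j, S_X))
    and close(commutator(S_Z, S_X), scaled(2j, S_Y))
)

actions_ok = (
    close(matmul(S_Z, PSI_PLUS), PSI_PLUS)
    and close(matmul(S_Z, PSI_MINUS), scaled(-1, PSI_MINUS))
    and close(matmul(S_Y, PSI_PLUS), scaled(1j, PSI_MINUS))
    and close(matmul(S_Y, PSI_MINUS), scaled(-1j, PSI_PLUS))
    and close(matmul(S_X, PSI_PLUS), PSI_MINUS)
    and close(matmul(S_X, PSI_MINUS), PSI_PLUS)
)

print("commutation relations hold:", cyclic_ok)
print("operator equations hold:   ", actions_ok)
```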

The spin operators can now be used in Schrödinger’s equation following the usual rules
of replacing variables by operators. Consider, for example, the simplest case of an electron
in a uniform magnetic field $\boldsymbol{B}$. The energy equation is $H\psi = E\psi$ and we replace the spin
terms in the Hamiltonian $H$ by spin operators. The energy term involving the interaction
between the magnetic moment of the electron $\boldsymbol{\mu}$ and the magnetic field is


$$H = \mu_z B_z + \mu_y B_y + \mu_x B_x. \tag{16.48}$$


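Using the vector relation $\boldsymbol{\mu} = (\hbar e/m_e c)\,\boldsymbol{s}$, where $\hbar = h/2\pi$ and the spin vector $\boldsymbol{s}$ is
replaced by the operator $\tfrac{1}{2}\boldsymbol{S}$, the corresponding operator for the magnetic
moment of the electron is $(eh/4\pi m_e c)\,\boldsymbol{S}$ and so Schrödinger’s equation becomes


$$\frac{eh}{4\pi m_e c}\left(S_z B_z + S_y B_y + S_x B_x\right)\psi = E\psi. \tag{16.49}$$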


In the simplest case in which the magnetic field is oriented in the z-direction, (16.49)
becomes

$$\frac{eh}{4\pi m_e c}\,S_z B\,\psi = E\psi, \quad \text{or} \quad |\mu| B\,S_z\psi = E\psi. \tag{16.50}$$


$\psi$ is given by the column vector (16.42) which then provides the two solutions for the
magnetic moment being parallel and antiparallel to the magnetic field direction with
energies $+|\mu|B$ and $-|\mu|B$. Thus, the spin operators fit naturally into the scheme of
the operator formalism, although they are not differential operators. Notice that the spin
operator has the property that it converts Schrödinger’s equation directly into an algebraic
equation, rather than a differential equation.
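
Because the problem is algebraic, the energies can be found by solving a 2 × 2 eigenvalue problem. The Python sketch below is a minimal illustration: it writes $|\mu|(S_xB_x + S_yB_y + S_zB_z)$ out explicitly with the Pauli matrices and finds the roots of the characteristic polynomial. The value of `MU` and the field components are assumed numbers in arbitrary units. For a field along z the energies are $\pm|\mu|B$; for a field in any other direction the same result holds with $B$ replaced by the magnitude of the field.

```python
import math


def spin_hamiltonian(mu, b_x, b_y, b_z):
    """|mu| (S_x B_x + S_y B_y + S_z B_z) written out with the Pauli matrices."""
    return [
        [mu * b_z, mu * (b_x - 1j * b_y)],
        [mu * (b_x + 1j * b_y), -mu * b_z],
    ]


def eigenvalues_2x2_hermitian(h):
    """Roots of the characteristic polynomial lambda^2 - trace*lambda + det = 0."""
    trace = complex(h[0][0] + h[1][1]).real
    det = complex(h[0][0] * h[1][1] - h[0][1] * h[1][0]).real
    root = math.sqrt(trace * trace - 4.0 * det)
    return (0.5 * (trace + root), 0.5 * (trace - root))


MU = 1.5  # |mu| in arbitrary (assumed) units

# Field of magnitude 2 along z, and a tilted field (1, 2, 2) of magnitude 3.
along_z = eigenvalues_2x2_hermitian(spin_hamiltonian(MU, 0.0, 0.0, 2.0))
tilted = eigenvalues_2x2_hermitian(spin_hamiltonian(MU, 1.0, 2.0, 2.0))

print("field along z:", along_z)
print("tilted field: ", tilted)
```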

The extension to incorporate spin into atomic physics follows by exactly the same 
formulation as in the case of the electron in a uniform magnetic field. For example, for 
the case of an electron in an atom in the presence of a uniform magnetic field, we can 
write $H\psi = E\psi$ with the Hamiltonian now consisting of two parts, $H_0$ associated with the
operator which appears in Schrödinger’s equation for the atom and a second term which
represents the interaction of the electron with the magnetic field,


$$H = H_0 + |\mu| B\,S_z. \tag{16.51}$$


Suppose we represent the wavefunction by the product $\psi(q)\psi(s)$, where $q$ stands for the
three space coordinates and $s$ for the spin coordinates. $H_0$ operates only on the space
coordinates and $\psi(q)$ is the resulting standard solution of the wave equation. $S_z$ operates
only on the spin part of the wavefunction $\psi(s)$. Consequently, by separation of variables,
the solution is found by solving separately the equations


$$H_0\psi(q) = E_q\psi(q), \qquad |\mu| B\,S_z\psi(s) = E_s\psi(s), \tag{16.52}$$


with $E = E_q + E_s$. For any given stationary state, there are two solutions associated with
the spin of the electron with energies $\pm|\mu|B$ relative to the case in which there is no
magnetic field. There is another pleasant consequence of this result. If the magnetic field 
is decreased, the two spin states will tend towards degeneracy as B — 0 and indeed the 
two spin states are degenerate in the limit of zero field. In fact, however, because of the 
effects of spin-orbit coupling, the electron experiences a magnetic field, as discussed in 
Sect. 16.1. Therefore, in practice, the degeneracy is lifted and this coupling accounts for 
the fine splitting of the energy levels of the atom. 

The important result of these calculations was that Pauli had succeeded in incorporating 
spin into the scheme of quantum mechanics. This achievement had profound implications 
for future studies of the quantum properties of matter since the way was opened up for fully 
incorporating spin into many different types of quantum mechanical problem, in particular, 
for systems of more than one electron. These studies led to the development of a
self-consistent scheme of spin in quantum mechanics, which was able to explain a large number
of phenomena in atomic and condensed matter physics. There was, however, no deeper
understanding of the origin of the spin of the electron — what Pauli had achieved was to find 
the correct formalism for including spin in quantum calculations, once it is assumed that 
the electron is a spin-half particle. The solution came with Dirac’s relativistic formulation 
of quantum mechanics. 





16.6 The Dirac equation and the theory of the electron 


From the outset, there were difficulties in incorporating relativity into quantum mechanics. 
Sommerfeld’s tour de force in using relativity to account for the splitting of the hydrogen 
lines according to the old quantum theory was a remarkable feat, but it was recognised
by the pioneers of quantum mechanics that building relativity into quantum and wave
mechanics was non-trivial. We recall that Schrödinger had begun with a fully relativistic 
version of his wave equation which he had to abandon because it resulted in incorrect 
energy levels for the hydrogen atom. With the undoubted successes of matrix and wave 
mechanics and the incorporation of spin into the formalism of quantum mechanics, the 
need to find a route to a relativistic version of Schrödinger’s equation was pressing. The
hero of this story is undoubtedly Dirac who continued to pursue his own individual agenda
for the development of a fully relativistic theory of quantum mechanics. He achieved this
goal in a pair of remarkable papers published in 1928 in which he generalised Schrödinger’s
equation, resulting in the linear Dirac equation (Dirac, 1928a,b). 

There is no question about Pauli’s priority in the discovery of the spin matrices, which 
he had discussed with Heisenberg in November 1926. Dirac worked independently on the 
problem of electron spin and derived his own version of the spin matrices during the early 
months of 1927. He was certainly aware of Pauli’s paper during the summer of 1927 when he
prepared a set of Lectures on Modern Quantum Mechanics for delivery in the Michaelmas
and Lent terms of the 1927-1928 Cambridge academic year. As discussed in Sect. 13.3,
Dirac was also aware that the set of 2 × 2 matrices described by Baker in his Principles of
Geometry (1922) were q-numbers and so would fit naturally into his approach to quantum
mechanics. According to Dirac’s reminiscences, he was not interested in incorporating
spin into a relativistic theory of quantum mechanics when he developed that theory in the 
autumn of 1927. In his words, 


‘I was not interested in bringing the spin of the electron into the wave equation, did not 
consider the question at all and did not make any use of Pauli’s work. The reason for this 
is that my dominating interest was to get a relativistic theory agreeing with my general 
physical interpretation and transformation theory. I thought that this problem should first 
be solved in the simplest possible case, which was presumably a spinless particle, and only 
after that should one go on to consider how to bring in spin. It was a great surprise to me 
when I later on discovered that the simplest possible case did involve a spin.’ (Dirac, 1977) 


The key role played by Pauli’s introduction of spin matrices is vividly portrayed by Kronig 
who wrote in the Pauli memorial volume (Kronig, 1960), 


‘In the formulation of quantum mechanics reported so far, the material particles are 
considered as characterised by position coordinates only. It now became an urgent task 
to incorporate the phenomenon indicated by the term ‘spin’ into the new scheme of 
things. The initial step was taken by Pauli himself [(1927b)], who proposed that the 
wave function of the electron, besides being determined by continuously variable position 
coordinates, should be considered as also depending on a spin variable, capable of two 
values only. . . . Thereby Pauli paved the way for the relativistic theory of the electron and 
of hydrogen-like atoms which we owe to Dirac (1928a).’ 


Pauli’s advance was to incorporate the spin of the electron consistently into the scheme of 
quantum mechanics by introducing a state vector $\psi(s)$, which as demonstrated by (16.42),
has two components. As van der Waerden remarked, 


‘The step from one to two components is large, whereas the step from two to four 
components is small.’ (van der Waerden, 1960)


16.6.1 The Dirac equation of a free electron


Let us outline how the Dirac equation can be formulated without going into too many 
technical details. The objective is to find an operator formalism for the relativistic equivalent 
of Schrödinger’s equation which reduces correctly to the standard equation in the
non-relativistic limit. The requirement of the relativistic theory must be that the equations are
invariant under Lorentz transformations. Schrödinger, Oskar Klein and Gordon had all
carried out what might be regarded as the ‘natural extension’ of the operator formalism of
quantum mechanics to the relativistic Lagrangian of an electron in an electromagnetic 
field. 
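Let us consider first the case of a free particle, for which the energy equation is


$$\frac{p^2}{2m_e} = E. \tag{16.53}$$


To find Schrödinger’s equation, we replace $p$ and $E$ by the differential operators


$$\boldsymbol{p} \to \frac{h}{2\pi i}\nabla, \qquad E \to i\,\frac{h}{2\pi}\frac{\partial}{\partial t}. \tag{16.54}$$


Schrödinger’s time-dependent wave equation follows immediately:


$$-\frac{h^2}{8\pi^2 m_e}\nabla^2\psi = i\,\frac{h}{2\pi}\frac{\partial\psi}{\partial t}. \tag{16.55}$$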





In the case of a free relativistic particle, the relation between momentum and energy, found 
by equating the norms of the momentum four-vector of the electron in the external frame 
and the rest frame of the electron, is given by the expression 


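$$p^2c^2 + m_e^2c^4 = E^2. \tag{16.56}$$

Using the definitions (16.54), this expression becomes


$$-\frac{h^2c^2}{4\pi^2}\nabla^2\psi + m_e^2c^4\psi = -\frac{h^2}{4\pi^2}\frac{\partial^2\psi}{\partial t^2}. \tag{16.57}$$
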
This is the simplest form of the Klein—Gordon equation, derived independently by 
Schrödinger, Klein and Gordon. As is apparent, it can only be used for particles with 
zero spin. As Schrödinger had found in his earliest searches for a wave equation to ac- 
commodate de Broglie waves, the solutions give the wrong answers for the energy levels 
of the hydrogen atom. Notice also the issue that the equation involves the second deriva- 
tive of the wavefunction with respect to time, whereas one of the great virtues of the 
standard Schrödinger equation is that it is linear, involving only the first derivative of the 
wavefunction with respect to time. 
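An alternative approach would be to express (16.56) in terms of $E$ rather than $E^2$, in
which case we would write


$$E = \sqrt{p^2c^2 + m_e^2c^4} \quad \text{and so} \quad i\,\frac{h}{2\pi}\frac{\partial\psi}{\partial t} = \sqrt{-\frac{h^2c^2}{4\pi^2}\nabla^2 + m_e^2c^4}\;\psi. \tag{16.58}$$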


This approach runs into difficulties because the spatial differential operators on the
right-hand side of the equation are inside the square root. Dirac had profound misgivings about
this approach to discovering the correct relativistic formulation of Schrödinger’s equation.
Specifically, he found that, although the Klein-Gordon equation could reproduce the same 
results for the position coordinates of a particle as in the non-relativistic case, this was not 
the case for dynamical variables such as momentum and angular momentum. Dirac was con- 
vinced that his transformation theory approach to fundamental quantum processes should 
underlie a proper relativistic formulation of quantum mechanics and this had consequences 
for how the translation from relativistic mechanics to relativistic quantum mechanics should 
be carried out. He stated in his great paper of 1928, 


‘The general interpretation of non-relativity quantum mechanics is based on transformation
theory, and is made possible by the wave equation being of the form


$$(H - W)\psi = 0, \tag{16.59}$$


that is, being linear in $W$ or $\partial/\partial t$, so that the wave function at any time determines the
wave function at any later time. The wave equation of the relativity theory must also be
linear in $W$ if the general interpretation is to be possible.’ (Dirac, 1928a)


Dirac’s approach to the laws of quantum mechanics through the use of q-numbers had
the great advantage that he did not need to specify exactly what the q-numbers were;
what was important was that, whatever they were, the non-commutative properties ensured 
that the non-commutation present at the heart of quantum mechanics would be preserved. 
Specifically, the wavefunctions involved need not be scalar functions of a single variable, 
but could be vectors, matrices, tensors, ... Already, he and Pauli had shown that the spin 
matrices provided the solution to the problem of incorporating spin into non-relativistic 
quantum mechanics and so he was in the ideal position to extend these insights into the 
development of the relativistic theory of the electron. 

In his own words, he found the solution by ‘playing around with mathematics’. The 
specific insight came from his considerations of finding relativistically invariant quantities 
which could be built into his scheme of quantum mechanics. He needed the equivalent 
of four-vectors, for which the norm, or the sum of squares of the four components of the 
four-vector, are invariants. Again, in his words, 


‘It took me quite a while, studying over this dilemma, before I suddenly realised that there 
was no need to stick to the quantities $\sigma$ [the spin matrices], which can be represented
by matrices with just two rows and columns. ... Why not go over to four rows and four
columns? Mathematically, there was no objection to this at all. Replacing the $\sigma$-matrices
by four-row and column matrices, one could easily take the square root of the sum of four 
squares, or even five squares if one wanted to.’ (Dirac, 1977) 


Dirac worked out the new relativistic theory of quantum mechanics of the electron over 
the 1927 Christmas break, using the insights which again must have been provoked by his 
close understanding of Baker’s Principles of Geometry (1922). On page 69, Baker writes:


‘Another symbol with the same laws of combination [non-commutation] ... may be


represented by 
ô, Yy =; =P 
=Y ô, b, a > 
ee <pl* (16.60) 


B, a, Ys ô 


To illustrate what is involved in the derivation of the Dirac equation and its consequences, 
we follow the presentation by Lindsay and Margenau (1957). There is an alternative way of 
writing the energy of a free electron in special relativity. Let us write down the four-vectors 
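associated with the momentum and velocity of the electron, $\mathbf{P}$ and $\mathbf{U}$ respectively:


$$\mathbf{P} = \left[\frac{E}{c},\, \boldsymbol{p}\right], \qquad \mathbf{U} = [\gamma c,\, \gamma\boldsymbol{v}], \tag{16.61}$$


where $\gamma = (1 - v^2/c^2)^{-1/2}$ is the Lorentz factor. The scalar product of the four-vectors $\mathbf{P}$
and $\mathbf{U}$ is an invariant and so


$$\mathbf{P}\cdot\mathbf{U} = \gamma E - \gamma(\boldsymbol{p}\cdot\boldsymbol{v}) = \text{constant} = m_e c^2, \tag{16.62}$$


since $\mathbf{P}\cdot\mathbf{U} = m_e c^2$ in the rest frame of the electron. Note that $\boldsymbol{p}$ and $\boldsymbol{v}$ are the relativistic
three-momentum, $\boldsymbol{p} = \gamma m_e \boldsymbol{v}$, and three-velocity $\boldsymbol{v}$ of the electron. Therefore, since $H = E$,
the Hamiltonian of the relativistic electron can also be written


$$H = \boldsymbol{p}\cdot\boldsymbol{v} + \frac{m_e c^2}{\gamma} = \boldsymbol{p}\cdot\boldsymbol{v} + \sqrt{1 - \frac{v^2}{c^2}}\; m_e c^2. \tag{16.63}$$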





This is now a first-order expression for H. 
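
That this first-order form reproduces the usual total energy $\gamma m_e c^2$ can be confirmed with a few lines of Python. The sketch is only a numerical check: the units ($m_e = c = 1$) and the sample velocity are assumptions made for illustration.

```python
import math

M_E = 1.0  # electron mass in assumed units
C = 1.0    # speed of light in the same units


def lorentz_factor(v):
    """gamma = (1 - v^2/c^2)^(-1/2) for a three-velocity v."""
    speed_sq = sum(comp * comp for comp in v)
    return 1.0 / math.sqrt(1.0 - speed_sq / C**2)


def hamiltonian_first_order(v):
    """H = p.v + m_e c^2 / gamma with p = gamma m_e v."""
    gamma = lorentz_factor(v)
    p = [gamma * M_E * comp for comp in v]
    p_dot_v = sum(pc * vc for pc, vc in zip(p, v))
    return p_dot_v + M_E * C**2 / gamma


def total_energy(v):
    """E = gamma m_e c^2."""
    return lorentz_factor(v) * M_E * C**2


velocity = (0.3, 0.4, 0.5)  # assumed sample velocity in units of c
print(hamiltonian_first_order(velocity), total_energy(velocity))
```

The two numbers agree because $\boldsymbol{p}\cdot\boldsymbol{v} = \gamma m_e v^2$ and $v^2/c^2 + 1/\gamma^2 = 1$.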
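We now make the following translation into the language of operators:


$$\boldsymbol{p} \to \frac{h}{2\pi i}\nabla; \quad v_x \to c\alpha_x; \quad v_y \to c\alpha_y; \quad v_z \to c\alpha_z; \quad \sqrt{1 - \frac{v^2}{c^2}} \to \alpha_4;$$

$$H = \frac{h}{2\pi i}\,c\left(\alpha_x\frac{\partial}{\partial x} + \alpha_y\frac{\partial}{\partial y} + \alpha_z\frac{\partial}{\partial z}\right) + \alpha_4 m_e c^2. \tag{16.64}$$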


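The momentum operator is the same type of differential operator as before, but the nature
of the $\alpha$ operators is undefined. It is convenient to call them ‘symbols’ using Baker’s
language, the nature of which has yet to be determined. Their algebraic properties can be
found by applying the operator (16.64) twice and then comparing with the results of the
operator substitutions which resulted in the Klein–Gordon equation (16.57). These are two
different ways of describing $H^2$. Therefore, we require


$$\left[\frac{h}{2\pi i}\,c\left(\alpha_x\frac{\partial}{\partial x} + \alpha_y\frac{\partial}{\partial y} + \alpha_z\frac{\partial}{\partial z}\right) + \alpha_4 m_e c^2\right] \cdot \left[\frac{h}{2\pi i}\,c\left(\alpha_x\frac{\partial}{\partial x} + \alpha_y\frac{\partial}{\partial y} + \alpha_z\frac{\partial}{\partial z}\right) + \alpha_4 m_e c^2\right] = -\frac{h^2c^2}{4\pi^2}\nabla^2 + m_e^2c^4. \tag{16.65}$$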
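Now, the expectation is that the $\alpha$s will correspond to something akin to the spin matrices,
which are independent of the coordinates x, y, z, and so they will commute with
$\partial/\partial x$, $\partial/\partial y$, $\partial/\partial z$, so that, for example, $(\alpha_x\,\partial/\partial x - \partial/\partial x\,\alpha_x) = 0$. Then, multiplying out
the scalar product in (16.65) and preserving the order in which the operators are applied,
we find


$$\begin{aligned} &-\frac{h^2c^2}{4\pi^2}\left[\alpha_x^2\frac{\partial^2}{\partial x^2} + \alpha_y^2\frac{\partial^2}{\partial y^2} + \alpha_z^2\frac{\partial^2}{\partial z^2}\right] \\ &-\frac{h^2c^2}{4\pi^2}\left[(\alpha_x\alpha_y + \alpha_y\alpha_x)\frac{\partial^2}{\partial x\,\partial y} + (\alpha_y\alpha_z + \alpha_z\alpha_y)\frac{\partial^2}{\partial y\,\partial z} + (\alpha_z\alpha_x + \alpha_x\alpha_z)\frac{\partial^2}{\partial z\,\partial x}\right] \\ &+\frac{h m_e c^3}{2\pi i}\left[(\alpha_x\alpha_4 + \alpha_4\alpha_x)\frac{\partial}{\partial x} + (\alpha_y\alpha_4 + \alpha_4\alpha_y)\frac{\partial}{\partial y} + (\alpha_z\alpha_4 + \alpha_4\alpha_z)\frac{\partial}{\partial z}\right] + \alpha_4^2\,m_e^2c^4 \\ &= -\frac{h^2c^2}{4\pi^2}\nabla^2 + m_e^2c^4. \end{aligned} \tag{16.66}$$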


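It immediately follows that we can find the commutation relations which the $\alpha$ symbols
must obey. Specifically, we require


$$\begin{aligned} \alpha_x^2 = \alpha_y^2 = \alpha_z^2 = \alpha_4^2 &= 1, \\ (\alpha_x\alpha_y + \alpha_y\alpha_x) = (\alpha_y\alpha_z + \alpha_z\alpha_y) = (\alpha_z\alpha_x + \alpha_x\alpha_z) &= 0, \\ (\alpha_4\alpha_x + \alpha_x\alpha_4) = (\alpha_4\alpha_y + \alpha_y\alpha_4) = (\alpha_4\alpha_z + \alpha_z\alpha_4) &= 0. \end{aligned} \tag{16.67}$$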


These results tell us that, applying the individual $\alpha$ operators twice, the result is the identity
operator, that is, simply multiplying by one. Secondly, notice that the operators $\alpha$
anti-commute: for commuting variables $ab - ba = 0$, whereas for anti-commuting
variables $ab + ba = 0$. Thus, returning to the Hamiltonian equation (16.63) and the translations
(16.64), the energy equation for $\psi$ becomes


$$\left[\frac{hc}{2\pi i}\left(\alpha_x\frac{\partial}{\partial x} + \alpha_y\frac{\partial}{\partial y} + \alpha_z\frac{\partial}{\partial z}\right) + \alpha_4 m_e c^2\right]\psi = E\psi, \tag{16.68}$$


where the properties of the symbols $\alpha_x$, $\alpha_y$, $\alpha_z$, $\alpha_4$ are determined by the commutation
relations (16.67). Equation (16.68) is Dirac’s equation for a free electron.

Pauli’s spin matrices are 2 × 2 matrices which describe the two spin states of the electron.
Now we have four $\alpha$s with the commutation properties (16.67). The simplest way of
accommodating these properties is by 4 × 4 matrices which were derived by Dirac. It will
not prove necessary to write these out for our present purposes. What is important is that
the wavefunction $\psi$ must also be a matrix and the simplest form it can have is as a column
matrix with four elements,

$$
\psi = \begin{bmatrix}\psi_1\\ \psi_2\\ \psi_3\\ \psi_4\end{bmatrix}. \tag{16.69}
$$

Thus, the Dirac equation (16.68) is actually four equations, one for each component of $\psi$.
To normalise the wavefunctions, we introduce the row vector $\psi^*$, with components which
are the complex conjugates of those of $\psi$. Thus, $\psi^* = [\psi_1^*, \psi_2^*, \psi_3^*, \psi_4^*]$ and so the normalisation
condition is

$$
\int\left(\psi_1^*\psi_1+\psi_2^*\psi_2+\psi_3^*\psi_3+\psi_4^*\psi_4\right)d\tau = 1. \tag{16.70}
$$
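Although the explicit matrices are not needed for the argument, it can be reassuring to see that a set of 4 × 4 matrices with the properties (16.67) really exists. The following Python sketch assumes one conventional choice, the standard Dirac-Pauli representation built from the 2 × 2 Pauli matrices. That representation and the helper names `block`, `matmul`, `anticommutator` and `satisfies_16_67` are illustrative assumptions and are not taken from the text.

```python
# Check of the relations (16.67) for one explicit set of 4 x 4 matrices.
# Assumption: the standard Dirac-Pauli representation, which the text does not write out.

SX = [[0, 1], [1, 0]]          # Pauli matrices
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]
MINUS_I2 = [[-1, 0], [0, -1]]
Z2 = [[0, 0], [0, 0]]


def block(a, b, c, d):
    """Assemble the 4 x 4 matrix [[a, b], [c, d]] from 2 x 2 blocks."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


def matmul(a, b):
    """Matrix product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def anticommutator(a, b):
    """Return ab + ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] + ba[i][j] for j in range(n)] for i in range(n)]


ALPHAS = {
    "x": block(Z2, SX, SX, Z2),
    "y": block(Z2, SY, SY, Z2),
    "z": block(Z2, SZ, SZ, Z2),
    "4": block(I2, Z2, Z2, MINUS_I2),
}
IDENTITY4 = block(I2, Z2, Z2, I2)
ZERO4 = block(Z2, Z2, Z2, Z2)


def satisfies_16_67(alphas):
    """True if every matrix squares to the identity and distinct pairs anti-commute."""
    labels = list(alphas)
    for i, first in enumerate(labels):
        if matmul(alphas[first], alphas[first]) != IDENTITY4:
            return False
        for second in labels[i + 1:]:
            if anticommutator(alphas[first], alphas[second]) != ZERO4:
                return False
    return True


print(satisfies_16_67(ALPHAS))
```

Other representations related to this one by a similarity transformation would pass the same check, which is why the argument in the text never needs the matrices written out.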


Let us now work out the energies of the eigenstates of a free electron according to Dirac’s
equation (16.68). For simplicity consider only motion in the x-direction. Then, (16.68)
reduces to

$$
(E-\alpha_4m_ec^2)\psi-\frac{hc}{2\pi i}\alpha_x\frac{d\psi}{dx} = 0. \tag{16.71}
$$

We note that (16.71) is a matrix equation and so $E$ means $E$ times the unit matrix, $\alpha_4$ is
a 4 × 4 matrix and the components of $d\psi/dx$ are $(d\psi_1/dx, d\psi_2/dx, d\psi_3/dx, d\psi_4/dx)$.
The solution of (16.71) is an exponential function which we can write

$$
\psi = A\exp\left(\frac{2\pi ip}{h}x\right), \tag{16.72}
$$

where $p$ is a number which will turn out to be the relativistic three-momentum of the
electron and $A$ is a column matrix with elements $(A_1, A_2, A_3, A_4)$. Now substituting this
solution back into (16.71), we find

$$
(E-m_ec^2\alpha_4-cp\alpha_x)A = 0. \tag{16.73}
$$


Now, we can operate on both sides of this equation with the same operator to the left of $A$
so that

$$
(E-m_ec^2\alpha_4-cp\alpha_x)(E-m_ec^2\alpha_4-cp\alpha_x)A = 0. \tag{16.74}
$$

Recalling that we need to preserve the order of the operators and that $E$ stands for the unit
matrix multiplied by $E$, we find

$$
\left[E^2+m_e^2c^4\alpha_4^2+c^2p^2\alpha_x^2-2E(m_ec^2\alpha_4+cp\alpha_x)+c^3pm_e(\alpha_4\alpha_x+\alpha_x\alpha_4)\right]A = 0. \tag{16.75}
$$

Now, the last term in round brackets, $\alpha_4\alpha_x+\alpha_x\alpha_4$, on the left-hand side of this expression
is zero because of the commutation relations (16.67), and the second term in round brackets,
acting on $A$, gives $EA$ because of (16.73). Therefore, since $\alpha_4^2 = \alpha_x^2 = 1$,

$$
(-E^2+m_e^2c^4+c^2p^2)A = 0. \tag{16.76}
$$

Notice that the operator associated with each of the terms is now
a unit matrix and so the terms in round brackets form an algebraic expression. Since $A$
cannot vanish, it follows that

$$
E^2 = m_e^2c^4+c^2p^2, \tag{16.77}
$$

and so the energy eigenvalues associated with the Dirac equation are

$$
E = \pm\sqrt{m_e^2c^4+c^2p^2}. \tag{16.78}
$$
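The squaring argument of (16.74) to (16.78) can be checked numerically. The sketch below is only an illustration: it assumes the standard Dirac-Pauli forms of $\alpha_x$ and $\alpha_4$ (not written out in the text), works in hypothetical units with $m_e = c = 1$, and uses the illustrative function names `free_hamiltonian`, `energy_magnitude` and `eigenvector`. It confirms that the matrix $H = cp\alpha_x + m_ec^2\alpha_4$ squares to $E^2$ times the unit matrix and that column vectors $A$ exist for both signs of $E$.

```python
import math


def free_hamiltonian(p, m=1.0, c=1.0):
    """H = c p alpha_x + m c^2 alpha_4 in the (assumed) standard representation."""
    cp, mc2 = c * p, m * c * c
    return [[mc2, 0.0, 0.0, cp],
            [0.0, mc2, cp, 0.0],
            [0.0, cp, -mc2, 0.0],
            [cp, 0.0, 0.0, -mc2]]


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matvec(a, v):
    """Product of a matrix with a column vector."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]


def energy_magnitude(p, m=1.0, c=1.0):
    """|E| from (16.78)."""
    return math.sqrt(m * m * c ** 4 + c * c * p * p)


def eigenvector(p, sign, m=1.0, c=1.0):
    """Column A with H A = sign*|E| A.

    Because H^2 = E^2, the vector (H + sign*|E|) applied to the first basis
    vector is an eigenvector of H with eigenvalue sign*|E|.
    """
    h = free_hamiltonian(p, m, c)
    e = sign * energy_magnitude(p, m, c)
    return [h[i][0] + (e if i == 0 else 0.0) for i in range(4)]


p = 0.75  # hypothetical momentum in units with m_e = c = 1
for sign in (+1, -1):
    a = eigenvector(p, sign)
    print(sign * energy_magnitude(p), a, matvec(free_hamiltonian(p), a))
```

The negative-energy column is just as legitimate a solution of the matrix equation as the positive-energy one, which is the point taken up at the end of this section.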


We can demonstrate that $p$ is the average momentum of the electron using Schrödinger’s
prescription and the solution (16.72) of the Dirac equation. The momentum operator is
$(h/2\pi i)\,\partial/\partial x$ and so,

$$
\frac{\displaystyle\int\psi^*\frac{h}{2\pi i}\frac{\partial\psi}{\partial x}\,dx}{\displaystyle\int\psi^*\psi\,dx} = \frac{p\displaystyle\int A^*A\,dx}{\displaystyle\int A^*A\,dx} = p. \tag{16.79}
$$

The energies of the eigenstates of the electron according to the Dirac equation are one of
its most remarkable features. According to classical physics, the negative energy solution
can be discarded as meaningless. But this is not the case in quantum mechanics which
allows discontinuous transitions between the discrete energy levels which are the solutions
of the wave equation. These negative energy solutions were a paradox, to which we will
return.

Our presentation contains the elements of what Dirac set out in the second section of his
paper. In Sect. 3, he demonstrated that the formalism is fully Lorentz invariant. The real
triumph of the theory came, however, in Sect. 4 of the paper in which the motion of an
electron in electric and magnetic fields is discussed.


16.6.2 The Dirac equation for an arbitrary electromagnetic field 


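In compliance with Dirac’s dictum that the equations of classical mechanics are not at fault,
simply the interpretation of the algebra, the natural extension of the scheme is to use the
classical expression for the Hamiltonian in electromagnetic fields, but now armed with
the new precepts of relativistic quantum mechanics. According to classical Hamiltonian
mechanics we need to make the replacements

$$
\mathbf{p}\rightarrow\mathbf{p}+\frac{e}{c}\mathbf{A}\;;\qquad H\rightarrow H+e\phi, \tag{16.80}
$$

where $\mathbf{A}$ is the vector potential and $\phi$ the scalar potential. It will be recalled that
$[\phi/c, A_x, A_y, A_z]$ forms the electromagnetic four-potential. Then, (16.68) becomes

$$
\begin{aligned}
H\psi &= \left[c\alpha_x\left(\frac{h}{2\pi i}\frac{\partial}{\partial x}+\frac{e}{c}A_x\right)+c\alpha_y\left(\frac{h}{2\pi i}\frac{\partial}{\partial y}+\frac{e}{c}A_y\right)\right.\\
&\qquad\left.+\,c\alpha_z\left(\frac{h}{2\pi i}\frac{\partial}{\partial z}+\frac{e}{c}A_z\right)+\alpha_4m_ec^2-e\phi\right]\psi\\
&= E\psi.
\end{aligned}
\tag{16.81}
$$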


As before, (16.81) is a set of four differential equations for $\psi_1$, $\psi_2$, $\psi_3$ and $\psi_4$. In the
case of the hydrogen atom, the potential is given by the Coulomb formula, $\phi = Ze/4\pi\epsilon_0r$,
and there is no magnetic field, $\mathbf{A} = 0$. The solutions of the equations for the hydrogen
atom agreed precisely with the experimentally measured values, including the details of the
fine structure of the energy levels, as demonstrated in the fourth edition of Sommerfeld’s
Atombau und Spektrallinien, which included additional material on Schrödinger’s wave
mechanics and Dirac’s papers of 1928 (Sommerfeld, 1929). Note that because the 4 × 4
matrices automatically take account of the spin of the electron and the new formulation is
fully relativistic, the fine-structure calculations include all the effects of spin and relativity.
Sommerfeld’s fine-structure constant appears naturally in these calculations.

Let us simplify the notation slightly. We can write

$$
\mathbf{p}' = \mathbf{p}+\frac{e}{c}\mathbf{A}\;;\qquad H' = H+e\phi, \tag{16.82}
$$

and then (16.81) can be written in more compact form as

$$
(c\,\boldsymbol{\alpha}\cdot\mathbf{p}'+\alpha_4m_ec^2-e\phi)\psi = E\psi. \tag{16.83}
$$


This has been written like a vector equation, but it will be understood that the components
of the ‘vector’ $\boldsymbol{\alpha}$ are the 4 × 4 $\alpha$ matrices and the equation is in fact a matrix equation. If
we add $e\phi\psi$ to both sides of (16.81), we obtain

$$
H'\psi = E'\psi, \tag{16.84}
$$

where $E' = E+e\phi$; notice that this is no longer an eigenfunction equation since $E'$ is no
longer a constant.

We now need to carry out some straightforward manipulation of the equations to reduce
the energy equation to a form like that of Schrödinger’s equation. First, we pre-multiply
(16.84) by $E'$ so that,

$$
E'H'\psi = E'^2\psi. \tag{16.85}
$$

We can rewrite the left-hand side of (16.85) as $H'E'\psi+(E'H'-H'E')\psi$ and then,
because of (16.84), the first term of this expression is $H'^2\psi$. Therefore,

$$
\left[H'^2+(E'H'-H'E')\right]\psi = E'^2\psi. \tag{16.86}
$$

Notice that the operator $H'$ is just the expression in the first equality of (16.81) or (16.83)
without the term $-e\phi$. The objective is now to evaluate the terms in square brackets in
(16.86).

Let us deal first with the term $H'^2\psi$. We can write this term as

$$
H'^2\psi = \left[c\alpha_xp'_x+c\alpha_yp'_y+c\alpha_zp'_z+\alpha_4m_ec^2\right]\left[c\alpha_xp'_x+c\alpha_yp'_y+c\alpha_zp'_z+\alpha_4m_ec^2\right]\psi. \tag{16.87}
$$

We use the commutation properties of the $\alpha$ matrices (16.67) and the $p'$ operators to reduce
(16.87) to the expression

$$
H'^2\psi = \left\{c^2\left[p'^2+\alpha_x\alpha_y(p'_xp'_y-p'_yp'_x)+\alpha_y\alpha_z(p'_yp'_z-p'_zp'_y)+\alpha_z\alpha_x(p'_zp'_x-p'_xp'_z)\right]+m_e^2c^4\right\}\psi. \tag{16.88}
$$

Next, we need to evaluate the terms such as $(p'_xp'_y-p'_yp'_x)\psi$ in (16.88) by expanding the
$p'$ operators as $\mathbf{p}' = \mathbf{p}+(e/c)\mathbf{A}$.


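$$
\begin{aligned}
(p'_xp'_y-p'_yp'_x)\psi &= \left(\frac{h}{2\pi i}\frac{\partial}{\partial x}+\frac{e}{c}A_x\right)\left(\frac{h}{2\pi i}\frac{\partial}{\partial y}+\frac{e}{c}A_y\right)\psi\\
&\qquad-\left(\frac{h}{2\pi i}\frac{\partial}{\partial y}+\frac{e}{c}A_y\right)\left(\frac{h}{2\pi i}\frac{\partial}{\partial x}+\frac{e}{c}A_x\right)\psi\\
&= \frac{he}{2\pi ic}\left[\frac{\partial}{\partial x}(A_y\psi)-A_y\frac{\partial\psi}{\partial x}-\frac{\partial}{\partial y}(A_x\psi)+A_x\frac{\partial\psi}{\partial y}\right]\\
&= \frac{he}{2\pi ic}\left(\frac{\partial A_y}{\partial x}-\frac{\partial A_x}{\partial y}\right)\psi = \frac{he}{2\pi ic}(\nabla\times\mathbf{A})_z\,\psi.
\end{aligned}
\tag{16.89}
$$

But the magnetic flux density $\mathbf{B} = \nabla\times\mathbf{A}$ and so

$$
(p'_xp'_y-p'_yp'_x)\psi = \frac{he}{2\pi ic}B_z\psi. \tag{16.90}
$$

The corresponding expressions for $(p'_yp'_z-p'_zp'_y)$ and $(p'_zp'_x-p'_xp'_z)$ are found by cyclic
permutation of the indices $x, y, z$.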
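The operator identity (16.90) can be checked numerically for a particular case. The sketch below is a rough illustration only: it assumes units with $h = e = c = 1$, a uniform field $B_z$ described by the vector potential $\mathbf{A} = (-B_zy/2, B_zx/2, 0)$, and an arbitrary smooth test function for $\psi$. All of these choices, and the function names, are hypothetical. The derivatives are approximated by central differences, so agreement is expected only to within a small numerical error.

```python
import cmath
import math

H_PLANCK = 1.0   # h, e and c in arbitrary units (assumption)
E_CHARGE = 1.0
C_LIGHT = 1.0
B_Z = 0.7        # uniform magnetic flux density along z (hypothetical value)


def a_x(x, y):
    """x-component of a vector potential whose curl is (0, 0, B_Z)."""
    return -0.5 * B_Z * y


def a_y(x, y):
    """y-component of the same vector potential."""
    return 0.5 * B_Z * x


def psi(x, y):
    """Arbitrary smooth test wavefunction (hypothetical)."""
    return cmath.exp(-(x * x + 0.5 * y * y) + 1j * (0.3 * x - 0.2 * y))


def p_prime(f, axis, a_comp, d=1e-4):
    """Return the function p'_axis f, with p' = (h / 2 pi i) d/d(axis) + (e/c) A."""
    def applied(x, y):
        if axis == "x":
            deriv = (f(x + d, y) - f(x - d, y)) / (2 * d)
        else:
            deriv = (f(x, y + d) - f(x, y - d)) / (2 * d)
        return (H_PLANCK / (2j * math.pi) * deriv
                + (E_CHARGE / C_LIGHT) * a_comp(x, y) * f(x, y))
    return applied


def commutator_on_psi(x, y):
    """Left-hand side of (16.90): (p'_x p'_y - p'_y p'_x) psi."""
    pxpy = p_prime(p_prime(psi, "y", a_y), "x", a_x)
    pypx = p_prime(p_prime(psi, "x", a_x), "y", a_y)
    return pxpy(x, y) - pypx(x, y)


def rhs_16_90(x, y):
    """Right-hand side of (16.90): (h e / 2 pi i c) B_z psi."""
    return H_PLANCK * E_CHARGE / (2j * math.pi * C_LIGHT) * B_Z * psi(x, y)


print(commutator_on_psi(0.4, -0.3), rhs_16_90(0.4, -0.3))
```

The second-derivative terms cancel between the two orderings, so only the terms in which a derivative acts on the vector potential survive, exactly as in the last line of (16.89).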
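To complete the reduction of (16.88), let us introduce a new set of matrices defined by

$$
\sigma_x = -i\alpha_y\alpha_z,\qquad \sigma_y = -i\alpha_z\alpha_x,\qquad \sigma_z = -i\alpha_x\alpha_y. \tag{16.91}
$$

Then, (16.88) can be written in the simple form

$$
H'^2\psi = \left[c^2\left(p'^2+\frac{he}{2\pi c}\,\boldsymbol{\sigma}\cdot\mathbf{B}\right)+m_e^2c^4\right]\psi. \tag{16.92}
$$

Next, we have to tackle the term $(E'H'-H'E')\psi$ in (16.86). We use (16.82), (16.83)
and the relation $E' = E+e\phi$ to write this term as

$$
\begin{aligned}
(E'H'-H'E')\psi &= e(\phi H'-H'\phi)\psi = ce(\phi\,\boldsymbol{\alpha}\cdot\mathbf{p}'-\boldsymbol{\alpha}\cdot\mathbf{p}'\,\phi)\psi\\
&= \frac{hce}{2\pi i}\left(\phi\,\boldsymbol{\alpha}\cdot\nabla\psi-\boldsymbol{\alpha}\cdot\nabla(\phi\psi)\right)\\
&= -\frac{hce}{2\pi i}\,\boldsymbol{\alpha}\cdot(\nabla\phi)\,\psi = \frac{hce}{2\pi i}\,\boldsymbol{\alpha}\cdot\mathbf{E}\,\psi.
\end{aligned}
\tag{16.93}
$$

$\mathbf{E}$ is the electric field strength which is assumed to be given by $\mathbf{E} = -\nabla\phi$, in other words,
there is no induced electric field and so $\partial\mathbf{A}/\partial t = 0$. Now substituting (16.92) and (16.93) into
(16.86), we obtain,

$$
\left[c^2p'^2+m_e^2c^4+\frac{hec}{2\pi}\,\boldsymbol{\sigma}\cdot\mathbf{B}+\frac{hce}{2\pi i}\,\boldsymbol{\alpha}\cdot\mathbf{E}\right]\psi = E'^2\psi. \tag{16.94}
$$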
To make a comparison with the non-relativistic Schrödinger equation, we use the
expressions,

$$
\mathbf{p}' = \frac{h}{2\pi i}\nabla+\frac{e}{c}\mathbf{A}\;;\qquad E' = E+e\phi = m_ec^2+W+e\phi. \tag{16.95}
$$

The reason for writing the expression for the energy of the electron as $m_ec^2+W$ is that,
throughout the analysis of this section, $E$ has been the total energy of the electron and so
to compare with the Schrödinger equation, we need to separate out the rest mass energy. $W$
represents the energy level of the electron according to non-relativistic physics. Carrying
out these substitutions and dividing through by $2m_ec^2$, we find

$$
\begin{aligned}
&\left[\left(-\frac{h^2}{8\pi^2m_e}\nabla^2-e\phi\right)+\left(\frac{he}{2\pi icm_e}\mathbf{A}\cdot\nabla+\frac{e^2}{2m_ec^2}A^2\right)+\left(\frac{he}{4\pi m_ec}\,\boldsymbol{\sigma}\cdot\mathbf{B}+\frac{he}{4\pi im_ec}\,\boldsymbol{\alpha}\cdot\mathbf{E}\right)\right]\psi\\
&\qquad=\left[W+\frac{(W+e\phi)^2}{2m_ec^2}\right]\psi.
\end{aligned}
\tag{16.96}
$$


It is convenient to compare this equation with Schrödinger’s non-relativistic wave
equation (14.37),

$$
\left(-\frac{h^2}{8\pi^2m_e}\nabla^2-e\phi\right)\psi = W\psi, \tag{16.97}
$$

where we have written $\phi = e/4\pi\epsilon_0r$ and $E = W$. If we remove all terms in $1/c$ from
(16.96), we see that it is identical with Schrödinger’s equation (16.97). In this case, the
four components of $\psi$, $[\psi_1, \psi_2, \psi_3, \psi_4]$, all satisfy the same Schrödinger equation, and so
Dirac’s equation reduces correctly to the non-relativistic case.

The additional terms which appear in (16.96) are purely relativistic effects. The vector
potential terms in the second round bracket on the left of the equation are the usual terms
which appear when terms in $1/c$ are included, and the term $(W+e\phi)^2/2m_ec^2$ on the
right-hand side is a relativistic correction to the energy $W$. The remarkable
feature is the presence of the terms in the third round bracket on the left-hand side of
(16.96),

$$
\left(\frac{he}{4\pi m_ec}\,\boldsymbol{\sigma}\cdot\mathbf{B}+\frac{he}{4\pi im_ec}\,\boldsymbol{\alpha}\cdot\mathbf{E}\right)\psi. \tag{16.98}
$$


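Suppose the electron had magnetic dipole moment $\boldsymbol{\mu}$ and electric dipole moment $\boldsymbol{\mu}_e$. Then,
classically, the electron would have additional energy contributions associated with the
interaction with a magnetic and electric field which would be

$$
\Delta W = \boldsymbol{\mu}\cdot\mathbf{B}+\boldsymbol{\mu}_e\cdot\mathbf{E}. \tag{16.99}
$$

Thus, the terms in (16.98) correspond to the electron having magnetic and electric moments

$$
\boldsymbol{\mu} = \frac{he}{4\pi m_ec}\,\boldsymbol{\sigma}\;;\qquad \boldsymbol{\mu}_e = \frac{he}{4\pi im_ec}\,\boldsymbol{\alpha}. \tag{16.100}
$$
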
Notice that what have been written as products of vectors in (16.98) are in fact 4 × 4
matrices. They are, however, closely related to the Pauli spin matrices. It is straightforward
to derive the forms of the matrices $\boldsymbol{\sigma}$ and $\boldsymbol{\alpha}$ from Dirac’s paper and these are given in the
endnotes. The spin matrices $\boldsymbol{\sigma}$ are of particular interest. As an example, it is shown in
endnote 8 that the $\sigma_z$ spin matrix is


$$
\sigma_z = -i\alpha_x\alpha_y = \begin{bmatrix}1&0&0&0\\0&-1&0&0\\0&0&1&0\\0&0&0&-1\end{bmatrix}. \tag{16.101}
$$

This is just an extension of the 2 × 2 spin matrix $S_z$ of (16.46) to a corresponding 4 × 4
matrix which Dirac created by duplicating the 2 × 2 matrix and filling up the remaining
elements with zeros. In fact, in his analysis of his equation, Dirac recognised that the use
of 4 × 4 matrices duplicated the solutions for the spin states of the electron.
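The duplication of the 2 × 2 spin matrix can be illustrated with a short Python sketch. It assumes the standard Dirac-Pauli forms of $\alpha_x$ and $\alpha_y$ and the usual Pauli matrix $\mathrm{diag}(1, -1)$ for the $z$-component; neither is written out in the text, so both are labelled as assumptions, as are the helper names. With these choices $-i\alpha_x\alpha_y$ comes out as two copies of the 2 × 2 matrix on the diagonal, with zeros elsewhere.

```python
# Assumption: standard Dirac-Pauli representation of the alpha matrices.
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
Z2 = [[0, 0], [0, 0]]


def block(a, b, c, d):
    """Assemble the 4 x 4 matrix [[a, b], [c, d]] from 2 x 2 blocks."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


ALPHA_X = block(Z2, SX, SX, Z2)
ALPHA_Y = block(Z2, SY, SY, Z2)


def sigma_z():
    """Evaluate -i alpha_x alpha_y, the definition of sigma_z in (16.91)."""
    product = matmul(ALPHA_X, ALPHA_Y)
    return [[-1j * entry for entry in row] for row in product]


DUPLICATED_SZ = block(SZ, Z2, Z2, SZ)   # 2 x 2 matrix repeated, zeros elsewhere

for row in sigma_z():
    print(row)
print(sigma_z() == DUPLICATED_SZ)
```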

The result (16.100) is the remarkable result of Dirac’s great paper. It shows that, in the
relativistic formulation of quantum mechanics, there is necessarily a magnetic moment
associated with the spin of the electron and that its magnitude is $eh/4\pi m_ec$. This is exactly
the result derived empirically by Uhlenbeck and Goudsmit that the magnetic moment
associated with the spin angular momentum had to be twice that associated with orbital
angular momentum.
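To give a feeling for the numbers, the following Python sketch evaluates $eh/4\pi m_ec$ in Gaussian (CGS) units, which match the $e/c$ factors used in this section, and compares the moment per unit angular momentum for spin and orbital motion. The numerical constants are standard reference values quoted to limited precision, and the function and variable names are illustrative assumptions. The comparison assumes a spin angular momentum of $h/4\pi$ and one unit $h/2\pi$ of orbital angular momentum carrying the same moment.

```python
import math

# Gaussian (CGS) values of the constants
H_PLANCK = 6.62607015e-27   # erg s
E_CHARGE = 4.80320471e-10   # esu
M_E = 9.1093837e-28         # g
C_LIGHT = 2.99792458e10     # cm / s


def dirac_magnetic_moment():
    """Magnitude e h / (4 pi m_e c) of the moment in (16.100), in erg per gauss."""
    return E_CHARGE * H_PLANCK / (4 * math.pi * M_E * C_LIGHT)


def moment_per_angular_momentum(moment, angular_momentum):
    """Ratio of magnetic moment to angular momentum."""
    return moment / angular_momentum


mu = dirac_magnetic_moment()
spin = H_PLANCK / (4 * math.pi)      # spin angular momentum, (1/2)(h / 2 pi)
orbital = H_PLANCK / (2 * math.pi)   # one unit of orbital angular momentum

ratio = (moment_per_angular_momentum(mu, spin)
         / moment_per_angular_momentum(mu, orbital))
print(mu, ratio)
```

The ratio of the two quantities is the factor of two referred to by Uhlenbeck and Goudsmit.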

In addition, (16.100) shows that there is also an electric dipole moment associated with 
the electron, but that it is imaginary. Dirac appreciated the problem, but expressed his view 
as follows: 


‘This magnetic moment is just that assumed in the spinning electron model. The electric 
moment, being a pure imaginary, we should not expect to appear in the model. It is 
doubtful whether the electric moment has any physical meaning, since the Hamiltonian 
in [(16.81)] that we started from is real and the imaginary part only appeared when we 
multiplied it up in an artificial way in order to make it resemble the Hamiltonian of 
previous theories.’ (Dirac, 1928a) 


It has been worth all the effort to produce these remarkable results which formally 
demonstrate the origin of the spin of the electron in quantum mechanics. The words of 
Lindsay and Margenau reveal just how remarkable it is: 


‘Dirac’s theory, therefore, produces the spin properties without a special postulate, and 
this is its major achievement. ... But equation [(16.96)] also warns us not to take the 
electron spin too literally. The equation merely indicates in a formal way that the electron, 
if placed in a field, has an added energy part of which may be interpreted by saying that the 
electron spins. ... Furthermore, there is no term in [(16.96)] which could be interpreted 
as energy due to mechanical rotation. On the whole, then, the situation is more complex 
than would be in accord with the simple classical statement: the electron spins.’ (Lindsay 
and Margenau, 1957) 


16.7 The discovery of the positron 


16.7.1 Dirac’s prediction of the anti-electron and antimatter 
The existence of negative energy states with $E \leq -m_ec^2$ according to Dirac’s theory was a
major concern. According to quantum mechanics, these states could not be ignored. Indeed,
they had to be included if Dirac’s relativistic quantum theory was to reduce correctly to the
classical results. In a letter to Pauli of 31 July 1928, Heisenberg showed that it was necessary
to include these negative energy terms if the correct expression for the dispersion formula
was to be obtained. Furthermore, Oskar Klein and Yoshio Nishina in their derivation of the
relativistic quantum theory of the scattering of high energy radiation by electrons found that
it was essential to include the negative energy states in order to account for the Compton
scattering properties of electrons at energies $h\nu > m_ec^2$ (Klein and Nishina, 1928, 1929).

Dirac proposed an ingenious solution to the problem by invoking Pauli’s exclusion 
principle. He came up with a picture in which the Universe is so densely packed with 
electrons in the negative energy states that they are all filled and so the electrons cannot 
make transitions from the positive to the negative energy states. The proposal was that the 
‘vacuum’ was filled with electrons, what became known as the ‘Dirac sea’. But there might 
well be vacancies, or ‘holes’ in the sea, a situation similar to that which gives rise to X-ray 
lines when an electron is removed from, say, the K-shell of the atom. On 26 November 
1929, Dirac wrote to Bohr, 


‘Such a hole... would appear experimentally as a thing with positive energy, since to 
make the hole disappear (that is, to fill it up) one would have to put negative energy into 
it. Further one can easily see that such a hole would move in an electromagnetic field 
as though it had positive charge. These holes I believe to be protons. When an electron 
of positive energy drops into a hole and fills it up, we have an electron and a proton 
disappearing simultaneously and emitting their energy in the form of radiation.’


These ideas were published by Dirac in a paper entitled A theory of electrons and protons in 
the 1 January 1930 edition of the Proceedings of the Royal Society of London (Dirac, 1930b). 
The theory was immediately challenged on a number of grounds, the most significant 
being the conclusion demonstrated independently by Igor Tamm, Robert Oppenheimer and 
Hermann Weyl that Dirac’s theory predicted that the electrons and holes should have the 
same mass. Ultimately, Dirac withdrew his proposal and replaced it by the idea that the 
holes would be ‘anti-electrons’. In his words, 


‘It appears that we must abandon the identification of the holes with protons and must 
find some other interpretation for them. Following Oppenheimer (1930), we must assume 
that in the world as we know it, all, and not merely nearly all, of the negative-energy states for the
electrons are occupied. A hole, if there were one, would be a new kind of particle, unknown 
to experimental physics, having the same mass and opposite charge to an electron. We 
may call such a particle an anti-electron.’ (Dirac, 1931) 


Dirac went further and stated that the protons must be unconnected with the electrons 
and that both electrons and protons should have negative energy states, thus introducing 
the concept of both anti-electrons and antiprotons (Dirac, 1931). This represented the 
introduction of the concept of antimatter into physics. 

This famous prediction was only the prelude to the principal concern of his paper 
which was entitled Quantized singularities in the electromagnetic field and concerned the 
theoretical possibility of the existence of magnetic monopoles (Dirac, 1931). He was well 
aware of the fact that this concept involved ‘a symmetry between electricity and magnetism 
quite foreign to current views’. His predicted value for the elementary magnetic pole
strength was $\mu = hc/4\pi e$. Despite many investigations, the magnetic monopole has never
been detected.


16.7.2 The discovery of positive electrons: the positrons


The late 1920s and the 1930s were periods of unprecedented discovery in atomic and 
nuclear physics. These will be summarised in Chapter 18. Suffice to say that the discoveries 
of the positron, the neutron, artificially induced nuclear interactions, the discovery of the 
mesotron and so on were all rapidly built into the pantheon of the new physics with quantum 
mechanics at its core. Here we are only concerned with the discovery of the positron. 

We need to retrace our steps somewhat to take up the story of the discovery of cosmic 
rays which were to provide a natural source of very high energy particles. In the early
1900s, it was known that there is a small amount of residual ionisation in the atmosphere 
and, close to the surface of the Earth, this could be attributed to the effects of natural 
radioactivity in rocks. In their pioneering experiments, Victor Hess and Werner Kolhörster 
made high altitude balloon observations of the degree of ionisation of the atmosphere and 
showed that above about 2 km altitude, the ionisation increases with increasing altitude. 
This phenomenon was attributed to some form of cosmic radiation which originated from 
above the Earth’s atmosphere (Hess, 1913; Kolhörster, 1913). They demonstrated that the
increase was exponential with increasing height, with $n(l) \propto \exp(\alpha l)$ and $\alpha \approx 10^{-3}\,\mathrm{m}^{-1}$.
This rate of increase corresponded to an absorption path length much greater than that
found for the most penetrating $\gamma$-rays observed in radioactive decays, that is, to much
more penetrating radiation. As Hess expressed it in his paper,


‘The results of the present observations seem to be most readily explained by the
assumption that a radiation of very high penetrating power enters our atmosphere from above,
and still produces in the lower layers a part of the ionisation observed in closed vessels.’


Originally it was assumed that the cosmic rays, as they were named by Millikan in 1925,
were high energy $\gamma$-rays with greater penetrating power than those observed in natural
radioactivity. In 1929, Dmitri Skobeltsyn, working in his father’s laboratory in Leningrad,
constructed a cloud chamber which was placed in the jaws of a strong magnet so that the
curvature of the tracks of the charged particles could be measured. Among the tracks, he
noted some which were hardly deflected at all and had all the appearance of being electrons
with energies greater than 15 MeV. He identified them with secondary electrons produced
by the ‘Hess ultra $\gamma$-radiation’.

A key technical development for these studies was the invention of the Geiger-Müller
detector by Hans Geiger and Walther Müller in 1928. This enabled individual
cosmic rays to be detected and the times of their arrival measured precisely (Geiger
and Müller, 1928, 1929). In 1929, Bothe and Kolhörster carried out one of the key
experiments in cosmic ray physics in which they introduced the concept of coincidence
counting to eliminate spurious background events (Bothe and Kolhörster, 1929). By
using two counters, one placed above the other, they found that simultaneous discharges
of the two detectors occurred very frequently, even when a strong absorber was placed
between the detectors. In one of the key experiments, slabs of lead and gold 4 cm thick
were placed between the counters and the mass absorption coefficient was found to agree
very closely with that of the attenuation of the cosmic radiation in the atmosphere. The
experiment demonstrated that the cosmic radiation consisted of highly energetic charged
particles.

Fig. 16.2 One of the discovery records of the positron. This cloud chamber photograph shows a 63 MeV positron passing through
a 6 mm lead plate and emerging as a 23 MeV positron. According to Anderson, the length of the latter path is at least
10 times greater than the possible length of a proton path of this curvature (Anderson, 1933).

From the 1930s until about 1960, the cosmic radiation provided a natural source of very 
high energy particles, of very much greater energies than those produced in radioactive 
decays. In 1930, Millikan and Anderson used an electromagnet 10 times stronger than that 
used by Skobeltsyn to study the tracks of particles passing through the cloud chamber. 
Anderson (1932) observed curved tracks identical to those of electrons, but with positive 
electric charges (Fig. 16.2). 

Fig. 16.3 Blackett and Occhialini’s automatic cloud chamber with which they carried out the experiments described in their
paper of 1933 (Blackett and Occhialini, 1933).

This discovery was confirmed by Patrick Blackett and Giuseppe Occhialini in 1933
using an improved technique in which the cloud chamber was only triggered after it was
certain that a cosmic ray had passed through the supersaturated vapour within the chamber
(Fig. 16.3) (Blackett and Occhialini, 1933). They obtained many excellent photographs of
the positive electrons and, on many occasions, observed showers containing equal numbers
of positive and negative electrons created by cosmic-ray interactions within the body of the
apparatus. Blackett and Occhialini’s analysis went considerably further than that of
Anderson in that they interpreted the positive and negative electrons as being produced
simultaneously in the interaction of the incoming cosmic ray particle with the material of
the chamber. They stated:


‘In this way one can imagine that negative and positive electrons may be born in pairs 
during the disintegration of light nuclei. If the mass of the positive electron is the same as 
that of the negative electron, such a twin birth requires an energy of $2m_ec^2 \approx 1$ million
[electron] volts, that is much less than the translatory energy with which they appear in 
general in the showers.’ (Blackett and Occhialini, 1933) 
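The figure of about one million electron volts quoted by Blackett and Occhialini can be checked with a few lines of Python. This is a minimal sketch using standard SI reference values of the constants; the function name `pair_threshold_mev` is an illustrative assumption.

```python
# SI values of the constants
M_E = 9.1093837e-31          # electron mass, kg
C_LIGHT = 2.99792458e8       # speed of light, m / s
ELECTRON_VOLT = 1.602176634e-19   # joules per electron volt


def pair_threshold_mev():
    """Rest-mass energy 2 m_e c^2 of an electron-positron pair, in MeV."""
    return 2 * M_E * C_LIGHT ** 2 / ELECTRON_VOLT / 1e6


print(pair_threshold_mev())
```

The result is slightly above 1 MeV, far below the energies of tens of MeV measured for the positron tracks in the cloud chamber photographs.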


They used Dirac’s calculations to show that the annihilation of the positron would occur 
very rapidly resulting in the formation of a pair of high energy photons. Thus, their paper 
introduced the concept of pair-production and annihilation of electrons and positrons. These 
experiments were conclusive evidence for the positron as predicted by Dirac’s theory of the 
electron and the first example of the existence of antimatter.

In completing the story of spin, we have run far ahead of the continued development of
the understanding of the matrix, operator and wave mechanical approaches to quantum
mechanics. The reconciliation of these approaches was described in Chap. 15, but there
remained the issue of the interpretation of the wavefunction and the deeper implications of
the theory. The understanding came gradually with Born’s interpretation of the
wavefunction, Ehrenfest’s demonstration of the equivalence of the classical and quantum pictures and
Heisenberg’s enunciation of the uncertainty principle. These led to what became known
as the Copenhagen interpretation of quantum mechanics. At the same time, the formal
mathematical foundations of the different approaches to quantum phenomena were set on
a secure foundation thanks to the efforts of Hilbert and many others. These developments
resulted in what may be referred to as the completion of quantum mechanics, in the sense
that it laid the foundations for all the future development of physics at the atomic and
subatomic level; some of these achievements are summarised in Chap. 18.


17.1 Schrödinger’s interpretation (1926)




Schrödinger regarded wave mechanics as superior to the matrix mechanical approach 
to quantum physics, not only because it was based upon the well-known eigenfunction 
techniques of classical physics, but also because it was much more visualisable. His first 
attempt at interpreting the wavefunction appeared in the final Sect. 7 of the fourth part of his
great series of papers (Schrödinger, 1926f) and was entitled On the physical significance
of the field scalar. There he identified the quantity $\psi\psi^*$ as the ‘weight function’ of the
distribution of charge so that $\rho_e = e\psi\psi^*$ is the electric charge density. In support of this
picture, he carried out an analysis using the time-dependent wave equations (14.102) to
evaluate the rate of change of charge density according to this prescription. The rate of
change of charge is then

$$
\frac{\partial}{\partial t}\int e\psi\psi^*\,dV = e\int\left(\psi\frac{\partial\psi^*}{\partial t}+\psi^*\frac{\partial\psi}{\partial t}\right)dV. \tag{17.1}
$$

Using the Schrödinger equations (14.102) for $\partial\psi/\partial t$ and $\partial\psi^*/\partial t$,

$$
\frac{\partial}{\partial t}\int e\psi\psi^*\,dV = -\frac{he}{4\pi im_e}\int\left(\psi^*\nabla^2\psi-\psi\nabla^2\psi^*\right)dV. \tag{17.2}
$$

We now use Green’s theorem to convert the integral on the right-hand side into a surface
integral:

$$
\frac{\partial}{\partial t}\int e\psi\psi^*\,dV = -\frac{he}{4\pi im_e}\oint\left(\psi^*\nabla\psi-\psi\nabla\psi^*\right)\cdot d\mathbf{A}, \tag{17.3}
$$

where $d\mathbf{A}$ is an element of surface area. Next, we define the vector $\mathbf{S}$ as

$$
\mathbf{S} = \frac{he}{4\pi im_e}\left(\psi^*\nabla\psi-\psi\nabla\psi^*\right), \tag{17.4}
$$

and then convert the surface integral back into a volume integral using the divergence
theorem,

$$
\frac{\partial}{\partial t}\int e\psi\psi^*\,dV = -\int\mathrm{div}\,\mathbf{S}\,dV. \tag{17.5}
$$

Taking the time differential inside the integral, (17.5) is the equation of continuity for the
conservation of electric charge in classical electrodynamics,

$$
\frac{\partial\rho_e}{\partial t}+\mathrm{div}\,\mathbf{S} = 0, \tag{17.6}
$$

recalling that the density of charge is defined as $\rho_e = e\psi\psi^*$. $\mathbf{S}$ is the electric current
density, corresponding to the quantity $\mathbf{J}$ which appears in Maxwell’s equations of classical
electrodynamics.
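The continuity equation (17.6) can be tested numerically in one dimension. The sketch below is an illustration under stated assumptions: the wavefunction is a hypothetical superposition of free-particle plane waves $a\exp[i(kx-\omega t)]$ with $\omega = hk^2/4\pi m_e$, which satisfies the free time-dependent Schrödinger equation in its conventional sign convention (this may differ from the convention of (14.102)); units are chosen with $h = m_e = e = 1$; and derivatives are approximated by central differences, so the residual is small but not exactly zero.

```python
import cmath
import math

H_PLANCK = 1.0   # Planck's constant h, arbitrary units (assumption)
MASS = 1.0       # particle mass m_e, arbitrary units (assumption)

# Hypothetical superposition: (amplitude, wavenumber) pairs
WAVES = [(1.0, 1.3), (0.6, -2.1), (0.4, 0.5)]


def psi(x, t):
    """Superposition of free plane waves with omega = h k^2 / (4 pi m)."""
    hbar = H_PLANCK / (2 * math.pi)
    return sum(a * cmath.exp(1j * (k * x - hbar * k * k * t / (2 * MASS)))
               for a, k in WAVES)


def density(x, t):
    """rho_e / e = psi psi*."""
    value = psi(x, t)
    return (value * value.conjugate()).real


def current(x, t, d=1e-5):
    """S / e from (17.4), with the gradient taken by central differences."""
    value = psi(x, t)
    grad = (psi(x + d, t) - psi(x - d, t)) / (2 * d)
    s = (H_PLANCK / (4j * math.pi * MASS)
         * (value.conjugate() * grad - value * grad.conjugate()))
    return s.real


def continuity_residual(x, t, d=1e-4):
    """d(rho)/dt + dS/dx, which should vanish according to (17.6)."""
    drho_dt = (density(x, t + d) - density(x, t - d)) / (2 * d)
    ds_dx = (current(x + d, t) - current(x - d, t)) / (2 * d)
    return drho_dt + ds_dx


print(continuity_residual(0.3, 0.8))
```

The two terms are individually of order unity for this superposition, so a residual many orders of magnitude smaller is a meaningful check that the charge density and current defined above are consistent.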

Schrödinger’s interpretation was reinforced by his analysis of the harmonic oscillator
according to wave mechanics which was presented in Sect. 14.6 and illustrated in Fig. 14.3. His
interpretation was that the particles are represented by wave groups composed of the
superposition of an infinite series of wavefunctions and that Fig. 14.3 represents the oscillation of
the electric charge in a harmonic oscillator. Therefore, just as in classical physics, the
oscillating charge emits dipole radiation at the frequency of the oscillator. Schrödinger believed
that the motion of the electron in the hydrogen atom could be interpreted in the same way:


‘We can definitely foresee that, in a similar way, wave groups can be constructed which
move round highly quantised Kepler ellipses and are the representation by wave mechanics
of the hydrogen electron. But the technical difficulties in the calculation are greater than
in the especially simple case which we have treated here.’ (Schrödinger, 1926a)


But this interpretation could not be correct as was pointed out by Heisenberg and Born.
In general, the wave-packets do spread out in space. Heisenberg showed that the case of
the harmonic oscillator was rather special because the successive energy levels are spaced
by equal amounts. Furthermore, in general, the wavefunction $\psi$ is a function in a
multi-dimensional space and so the interpretation of the motion of the electron as a wave-packet in
three-dimensional space is not viable. Heisenberg was particularly opposed to Schrödinger’s
interpretation which seemed to omit many of the very considerable achievements of
quantum physics. As he wrote to Pauli in June 1926,


‘The more I ponder about the physical part of Schrödinger’s theory, the more horrible
I find it. One should imagine the rotating electron, whose charge is distributed over the
entire space and which has an axis in a fourth and fifth dimension. What Schrödinger
writes about the visualizability of his theory ... I find rubbish.’


Soon after, he complained that, while he admired the power of Schrödinger’s equation
in simplifying the evaluation of matrix elements in quantum mechanics, Schrödinger’s
interpretation


‘throws overboard everything which is “quantum theoretical”: namely, the photoelectric
effect, the Franck[-Hertz] collisions, the Stern-Gerlach effect, ...’


In addition, there was the problem of understanding the electron diffraction experiments
in crystals and collision phenomena involving electrons. In interpreting these experiments,
it had to be assumed that the waves dispersed, the classical analogue being Huygens’
construction for the interference of light waves, and so how could the stability of the
particle as a discrete entity be explained by Schrödinger’s picture? The new interpretation
came from Born’s study of the scattering of electrons by atoms.


17.2 Born’s probabilistic interpretation of the wavefunction $\psi$ (1926)


In his Nobel Prize speech of 1954, Born explained that he was opposed to Schrödinger’s 
interpretation: 


‘On this point, I could not follow him. This was connected with the fact that my Institute 
and that of James Franck were housed in the same building of the Göttingen University. 
Every experiment by Franck and his assistants on electron collisions (of the first and 
second kind) appeared to me as a new proof of the corpuscular nature of the electron.’ 
(Born, 1961a) 


Born set about carrying out a quantum mechanical calculation of the scattering of charged
particles such as $\alpha$-particles or electrons by atoms using wave mechanical techniques. As
he remarked,


‘... among the various forms of the theory, only Schrödinger’s formalism proved itself
appropriate for this purpose; for this reason I am inclined to regard it as the most profound
formulation of the quantum laws.’ (Born, 1926a)


These remarks were made in a preliminary report of his calculations which were explained 
in much more detail in two further papers (Born, 1926b,d). The analysis involved what 
became known as the Born approximation, in which the incoming particle is represented 
by plane-wave functions incident upon the scattering centre from the positive z-direction. 
The scattered outgoing waves are represented by plane waves at infinity. Born treated the 
scattered wave as a first-order perturbation of the combined unperturbed wavefunctions of 
the scattering centre and the incoming particle. If the unperturbed wavefunction of the atom
is $\psi_n^0(q)$ and the energy of the incoming electron is $E = p^2/2m_e = h^2/2m_e\lambda^2$, he took the
eigenfunction of the unperturbed system to be

$$
\psi^0_{n,E}(q, z) = \psi_n^0(q)\,\sin(2\pi z/\lambda), \tag{17.7}
$$

where $n$ labels the $n$th wavefunction and $q$ the spatial coordinates relative to the scattering
centre. Then, if $V(x, y, z, q)$ is the potential energy of interaction between the charged
particle and the atom, he could apply the techniques of perturbation theory to evaluate the
amplitude of the scattered plane wave at infinity,

$$
\psi^{(1)}_n(x, y, z, q) = \iint d\omega\,\psi_E(\alpha, \beta, \gamma)\,\sin k(\alpha x+\beta y+\gamma z+\delta)\,\psi_n^0(q). \tag{17.8}
$$

This equation has the following meaning. The superscript (1) means the first-order
perturbation solution for the scattered wave. The double integral is over solid angle, $d\omega$
being the element of solid angle in the direction of the unit vector, the components
of which are $\alpha$, $\beta$ and $\gamma$; $\delta$ is an additional scalar phase factor. $\psi_E(\alpha, \beta, \gamma)$ is a
wavefunction which determines what is now referred to as the differential cross-section for scattering
in the $(\alpha, \beta, \gamma)$ direction. Since all the experiments on electron scattering indicated that
the scattered electrons had a ‘corpuscular nature’, Born inferred that the only possible
interpretation of the expression $|\psi_E(\alpha, \beta, \gamma)|^2$ was that it represented the probability that
the electron, approaching along the $z$-axis, is scattered in the $(\alpha, \beta, \gamma)$ direction. The
wavefunction $\psi^{(1)}_n(x, y, z, q)$ was then related to the total cross-section for the scattering of the
electron by the atom.

The implications of Born’s calculations were profound. He concluded that quantum mechanics 
does not answer the question, ‘What, precisely, is the state of the system after the 
collision?’, but rather the question ‘What is the probability of a particular state after the 
collision?’ Thus, Born introduced the concept that the wavefunction $\psi$ and the square of 
its amplitude $|\psi|^2$ determine the probabilities of events occurring in quantum mechanics. 
Born’s thinking was strongly influenced by Einstein’s interpretation of the relation between 
electromagnetic waves and light quanta, the very heart of the wave-particle duality. 
According to Born, Einstein interpreted the electromagnetic field E(x, y, z, t) as a ‘phantom’ 
or ‘ghost’ field, a ‘Gespensterfeld’, which served to guide the light quanta. The 
intensities, and hence the density of light quanta, were determined by the square of the 
amplitude of the electromagnetic field. In Born’s interpretation, the strict equivalence of 
the wavefunction and the properties of electromagnetic waves was reinforced by the comparison 
of the wavefunction of a particle of energy E and momentum p and the expression 
for the amplitude of an electromagnetic wave, 


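$$\underbrace{\exp\left[2\pi i \nu \left(t - \frac{x}{c}\right)\right]}_{\text{Electromagnetic wave}} \qquad \underbrace{\exp\left[\frac{2\pi i}{h}\left(Et - px\right)\right]}_{\text{de Broglie wave}} , \qquad (17.9)$$

where $\nu = E/h$ and $\lambda = h/p$. The first expression is proportional to the amplitude of the 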
electromagnetic wave and the second to the amplitude of the de Broglie wave associated 
with the electron. Since the energy density of the light waves depends upon the square of 
the amplitude of the waves, Born translated this into the statement that 


‘... it was almost self-understood to regard $|\psi|^2$ as the probability density of particles.’ 


It was soon appreciated that these probabilities are different from those used in classical 
statistical mechanics and the theory of Gaussian statistics. Einstein had already understood 


Downloaded from Cambridge Books Online by IP 132.166.47.205 on Thu Jan 23 07:55:51 GMT 2014. 
http://dx.doi.org/10.1017/CBO9781139062060.018 
Cambridge Books Online © Cambridge University Press, 2014 







the difference between the statistics of waves and particles in his great paper of 1909 
which was analysed in Sect. 3.6 (Einstein, 1909). According to classical statistics, if $p_1$ 
and $p_2$ are the probabilities of outcomes 1 and 2 taking place in some experiment, the 
combined probability of one or other of them occurring is $p_1 + p_2$. Translating this into 
Born’s interpretation of the wavefunctions, this would mean that the probabilities would 
correspond to $|\psi_1|^2 + |\psi_2|^2$, where $\psi_1$ and $\psi_2$ are the wavefunctions associated with $p_1$ 
and $p_2$. But this is not the correct rule for the superposition of waves. We need first to form 
the sum of the two wavefunctions $\psi_1 + \psi_2$ and then the probability is given by the square 
of the modulus of the joint wavefunction, 


$$p_{12} = |\psi_1 + \psi_2|^2 = |\psi_1|^2 + |\psi_2|^2 + \psi_1^* \psi_2 + \psi_1 \psi_2^* . \qquad (17.10)$$


The last two terms are the ‘interference terms’ which give rise to the phenomena of electron 
diffraction and the ‘wave’ aspects of the wave–particle duality. Note that these are similar 
in form to the terms which appear in the statistical properties of electromagnetic waves (see 
Sect. 3.6). 
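
The difference between the two rules can be made concrete with a minimal numerical sketch. The amplitudes and relative phases below are illustrative assumptions, not values taken from any experiment; the function name `combine` is likewise hypothetical.

```python
import cmath
import math


def combine(psi1: complex, psi2: complex):
    """Compare the classical and quantum rules for two alternatives.

    Returns (classical, quantum, interference), where
      classical    = |psi1|^2 + |psi2|^2
      quantum      = |psi1 + psi2|^2           (left-hand side of 17.10)
      interference = psi1* psi2 + psi1 psi2*   (last two terms of 17.10)
    """
    classical = abs(psi1) ** 2 + abs(psi2) ** 2
    quantum = abs(psi1 + psi2) ** 2
    # psi1* psi2 + psi1 psi2* is twice the real part of psi1* psi2
    interference = 2.0 * (psi1.conjugate() * psi2).real
    return classical, quantum, interference


amplitude = 0.5  # illustrative value, chosen so that the largest probability is 1
for phase in (0.0, math.pi / 2, math.pi):
    psi1 = complex(amplitude, 0.0)
    psi2 = amplitude * cmath.exp(1j * phase)
    classical, quantum, interference = combine(psi1, psi2)
    print(f"phase = {phase:.3f}: classical = {classical:.3f}, "
          f"quantum = {quantum:.3f}, interference = {interference:+.3f}")
```

The classical sum is the same for every relative phase, whereas the quantum probability ranges from twice the classical value (waves in phase) to zero (waves in antiphase), which is the essence of electron diffraction.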

Jammer’s commentary on what Born had achieved is revealing: 


‘For Einstein the notion of probability, even as he applied it to reconcile his light-quantum 
hypothesis with Maxwell’s theory of electromagnetic waves, was the traditional concept 
of classical physics, a mathematical objectivisation of the human deficiency of complete 
or exact knowledge but ultimately a creation of the human mind . . . For Born probability, 
as far as it was related to the wave function, was not merely a mathematical fiction 
but something endowed with physical reality, for it evolved with time and propagated 
in space in accordance with Schrödinger’s equation. It differed, however, from ordinary 
physical agents in one fundamental aspect: it did not transmit energy or momentum. 
Since in classical physics, whether Newtonian mechanics or Maxwellian electrodynamics, 
only what transfers energy or momentum (or both) is regarded as physically ‘real’, the 
ontological status of $\psi$ had to be considered as something intermediate.’ (Jammer, 1989) 


Born appreciated that the wavefunction $\psi$ could be expressed as an expansion in a complete, 
orthonormal set of eigenfunctions which are solutions of the appropriate Schrödinger 
equation, 

$$\psi = \sum_n c_n \psi_n , \qquad (17.11)$$

with a completeness relation which results from the orthogonality relations for the 
eigenfunctions, 

$$\int |\psi(q)|^2 \, \mathrm{d}q = \sum_n |c_n|^2 . \qquad (17.12)$$

Born then interpreted $\int |\psi(q)|^2 \, \mathrm{d}q$ as the total number of particles and the $|c_n|^2$ as the 
statistical frequency of occurrence of the nth eigenstate of the solutions of Schrödinger’s 
equation. 
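
The completeness relation (17.12) can be checked numerically. The sketch below assumes, purely for illustration, the orthonormal eigenfunctions of a particle in a box of unit length, $\psi_n(q) = \sqrt{2}\sin(n\pi q)$, and an arbitrary unnormalised trial wavefunction $q(1 - q)$; neither choice comes from Born’s paper.

```python
import math


def integrate(f, a, b, n=2000):
    """Midpoint-rule quadrature of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def basis(n, q):
    """Orthonormal eigenfunction sqrt(2) sin(n pi q) on [0, 1] (assumed example)."""
    return math.sqrt(2.0) * math.sin(n * math.pi * q)


def psi(q):
    """Illustrative unnormalised wavefunction vanishing at q = 0 and q = 1."""
    return q * (1.0 - q)


# Expansion coefficients c_n = integral of psi_n(q) psi(q) dq (the basis is real)
coeffs = [integrate(lambda q, n=n: basis(n, q) * psi(q), 0.0, 1.0)
          for n in range(1, 21)]

norm = integrate(lambda q: psi(q) ** 2, 0.0, 1.0)   # left-hand side of (17.12)
sum_c2 = sum(c * c for c in coeffs)                 # right-hand side, truncated at n = 20
print(norm, sum_c2)
```

With only twenty terms the two sides already agree to high accuracy, because the coefficients of this smooth trial function fall off rapidly with n.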

This interpretation had a number of immediate successes. Wentzel (1926b) used Born’s 
wave mechanical approach to derive the Rutherford scattering formula, while Faxén and 
Holtsmark (1927), Bethe (1930) and Mott (1928) used Born’s approach to study the passage 
of fast and slow particles through matter. Among the successes of these papers was the 
interpretation of the Ramsauer–Townsend effect, the minimum in the cross-section for the 
scattering of low-energy electrons in the noble gases argon, krypton and xenon (Ramsauer, 
1921; Townsend and Bailey, 1922); this phenomenon had no explanation according to 
classical physics, but is found naturally in the quantum theory of electron scattering. 
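
As an illustration of how the Rutherford formula emerges from the first Born approximation, the following sketch uses the standard textbook route through an exponentially screened Coulomb potential $V(r) = A\,e^{-\mu r}/r$, letting the screening parameter $\mu$ tend to zero. This is an assumption made for illustration and is not claimed to reproduce the details of Wentzel’s own calculation; units with $\hbar = m = 1$ and the function names are likewise choices made here.

```python
import math


def born_cross_section(theta, k, strength, mu):
    """First Born approximation d(sigma)/d(Omega) for V(r) = strength * exp(-mu r) / r.

    Units with hbar = m = 1. The scattering amplitude is
    f = -2 * strength / (mu^2 + q^2), where q = 2 k sin(theta / 2)
    is the momentum transfer.
    """
    q = 2.0 * k * math.sin(theta / 2.0)
    amplitude = -2.0 * strength / (mu ** 2 + q ** 2)
    return amplitude ** 2


def rutherford_cross_section(theta, k, strength):
    """Rutherford formula (strength / 4E)^2 / sin^4(theta / 2) with E = k^2 / 2."""
    energy = 0.5 * k ** 2
    return (strength / (4.0 * energy)) ** 2 / math.sin(theta / 2.0) ** 4


k, strength = 3.0, 1.5  # illustrative wavenumber and coupling strength
for mu in (1.0, 0.1, 1e-3):
    ratio = (born_cross_section(1.0, k, strength, mu)
             / rutherford_cross_section(1.0, k, strength))
    print(f"mu = {mu:g}: Born / Rutherford = {ratio:.6f}")
```

The ratio tends to unity as the screening is removed, while for finite $\mu$ the forward cross-section remains finite.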

Born (1926c) next tackled the interpretation of the time-dependence of the wavefunction 
using the time-dependent Schrödinger wave equation, 

$$\nabla^2 \psi - \frac{8\pi^2 m_e}{h^2} U(x) \, \psi - \frac{4\pi i m_e}{h} \frac{\partial \psi}{\partial t} = 0 . \qquad (17.13)$$

Assuming the wavefunctions $\psi_n(x)$ are normalised, the general solution was taken to be 

$$\psi(x, t) = \sum_n c_n \psi_n(x) \exp\left(\frac{2\pi i}{h} W_n t\right) , \qquad (17.14)$$

where $W_n$ is the energy of the nth eigenstate. Therefore, at time t = 0, the wavefunction is 

$$\psi(x, 0) = \sum_n c_n \psi_n(x) . \qquad (17.15)$$


Born considered the result of applying a force F(x, t) to the system which acts only during 
the time interval 0 < t < T. He treated the action of this force as a small perturbation to 
the potential U(x) so that U(x) was replaced by $U(x) + \kappa F(x, t)$ where the factor $\kappa$ is 
later used as a small expansion parameter. Using the perturbation techniques described by 
Schrödinger in Part 4 of his series (Schrödinger, 1926f), Born demonstrated that, for the 
simple case in which $\psi(x, t)$ consists of only a single eigenfunction $\psi_n(x)$, the solution of 
the time-dependent problem for t > T is 

$$\psi_n(x, t) = \sum_m b_{nm} \psi_m(x) \exp\left(\frac{2\pi i}{h} W_m t\right) , \qquad (17.16)$$

the coefficients $b_{nm}$ being determined by the action of the force F(x, t) during the interval 
0 < t < T. In the spirit of the probability interpretation of the wavefunction, he immediately 
interpreted the quantity $|b_{nm}|^2$ as the probability that the system changed from the initial 
state n to the final state m; in other words, the quantities $|b_{nm}|^2$ represent the transition 
probabilities between the states n and m. 

Having dealt with a single wavefunction $\psi_n(x)$, Born could then generalise to the case 
in which the initial state is given by (17.15) and so, for t > T, 

$$\psi(x, t) = \sum_n c_n \psi_n(x, t) . \qquad (17.17)$$

He now worked out the total transition probability into the state n by writing 

$$\psi(x, t) = \sum_n C_n \psi_n(x) , \qquad (17.18)$$

for times t > T. Using the orthogonality properties of the wavefunctions $\psi_n(x)$, he derived 
the key expression 

$$C_n = \int \psi(x, T) \, \psi_n^*(x) \, \mathrm{d}x = \sum_m b_{mn} c_m \exp\left(\frac{2\pi i}{h} W_n T\right) \qquad (17.19)$$

and so 

$$|C_n|^2 = \left| \sum_m c_m b_{mn} \right|^2 . \qquad (17.20)$$








In carrying out this calculation, Born had uncovered the rules for the determination of 
probabilities in quantum mechanics which differ crucially from their classical counterparts. 
Thus, classically, if the transition probability from the state m to n is $P_1 = |b_{mn}|^2$ 
and the probability of that state m occurring is $P_2 = |c_m|^2$, the joint probability would be 
$P = P_1 P_2 = |b_{mn}|^2 |c_m|^2$, which is quite different from the quantum rule for the addition 
of amplitudes of the wavefunctions to generate the probabilities (17.20); note the same 
differences between classical and quantum probabilities as described by (17.10). As expressed 
by Jammer (1989), 


‘... Born advanced two theorems which were destined to play a fundamental role in the 
further development of quantum theory, its interpretation, and its theory of measurement: 


1. the theorem of spectral decomposition according to which there corresponds a possible 
state of motion to every component $\psi_n$ in the expansion or superposition of $\psi$; 

2. the theorem of interference of probabilities according to which the phases of the 
expansion coefficients, and not only their absolute values, are physically significant.’ 


These are profound insights into the physical meaning of quantum mechanical calculations. 
The amplitudes and phases of the wavefunctions are both crucial in describing quantum 
phenomena and consequently complex numbers, which incorporate both amplitude and 
phase, are the natural language of quantum mechanics. These results were independently 
discovered by Dirac in his important paper On the theory of quantum mechanics using his 
rather different approach to quantum mechanics (Dirac, 1926f). 
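
The role of the phases in the rule (17.20) can be seen in a small numerical sketch. The two-state unitary matrix b and the initial coefficients below are hypothetical values chosen for illustration; the classical rule is taken here to be the sum over m of the joint probabilities $|c_m|^2 |b_{mn}|^2$.

```python
import math


def quantum_final(c, b):
    """Quantum rule (17.20): |C_n|^2 = |sum_m c_m b_mn|^2 for each final state n.

    The factor exp(2 pi i W_n T / h) in (17.19) has unit modulus and
    therefore drops out of |C_n|^2.
    """
    size = len(c)
    return [abs(sum(c[m] * b[m][n] for m in range(size))) ** 2
            for n in range(size)]


def classical_final(c, b):
    """Classical rule: add the joint probabilities |c_m|^2 |b_mn|^2 over m."""
    size = len(c)
    return [sum(abs(c[m]) ** 2 * abs(b[m][n]) ** 2 for m in range(size))
            for n in range(size)]


s = 1.0 / math.sqrt(2.0)
b = [[s, s], [s, -s]]  # hypothetical unitary matrix of transition amplitudes

for label, c in (("in phase", [s, s]), ("opposite phase", [s, -s])):
    print(label, "quantum:", quantum_final(c, b),
          "classical:", classical_final(c, b))
```

Both initial states have the same values of $|c_m|^2$, so the classical rule cannot distinguish them, yet the quantum rule sends one entirely to the first final state and the other entirely to the second. This is the content of the theorem of interference of probabilities.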


17.3 Dirac-Jordan transformation theory 


Born’s probabilistic interpretation of the wavefunction was a key advance which was to 
be deepened as the formalism of quantum mechanics developed. At the same time, while 
the matrix and wave mechanical approaches had been shown to be equivalent, a unifying 
mathematical theory was not available at the time Born published his paper in 1926. We 
took the story of Dirac’s formulation of quantum mechanics as far as his solution of the 
problem of the hydrogen atom in Chap. 13 and then jumped ahead to his discovery of the 
relativistic version of the Schrödinger equation, the Dirac equation, the magnetic moment 
of the electron and the prediction of the positron and antimatter in Chap. 16. 

We pick up the story from Sect. 13.3 where the elements of what became known as 
transformation theory were described in the context of his introduction of q-numbers and 
their algebra. There, we encountered the rules for the transformation of the canonical 
variables $Q_r$ and $P_r$, which are functions of q-numbers, obeying the following rules in 
terms of their Poisson brackets, 

$$[Q_r, P_s] = \delta_{rs} , \qquad [Q_r, Q_s] = [P_r, P_s] = 0 . \qquad (17.21)$$

Then, $Q_r$ and $P_r$ can be transformed to another set of canonical variables $q_r$ and $p_r$ by 
relations of the form 

$$Q_r = b q_r b^{-1} , \qquad P_r = b p_r b^{-1} , \qquad (17.22)$$

where b is a q-number. We observed the strong similarity to the canonical transformations 
(12.64) introduced by Born and his colleagues in the matrix treatment of quantum algebra. 
As Born remarked, 


‘A function S is to be determined, such that when 
$$p = S p_0 S^{-1} , \qquad q = S q_0 S^{-1} , \qquad (17.23)$$
the function 
$$H(p, q) = S H(p_0, q_0) S^{-1} = W \qquad (17.24)$$
becomes a diagonal matrix.’ (Born et al., 1926) 


The rigorous proof of this result was given by Jordan (1926) who developed the theory for 
action and angle variables. Despite these advances, the theory was of limited usefulness 
since, in general, it proved difficult to find the reciprocal matrices $S^{-1}$. Furthermore, the 
matrix mechanical approach could not deal with the case of constant p. With Schrödinger’s 
demonstration of the equivalence of the matrix and wave mechanical approaches to quantum 
phenomena, London (1926) carried over the concepts of matrix mechanics into the 
framework of Schrödinger’s wave mechanics. It was only after this work was completed that 
it was realised that London had formulated the problem using closely analogous procedures 
to those employed in the use of linear operators in functional spaces, which were discussed 
in detail in Chap. 15. Jammer (1989) summarises London’s achievements and what Dirac 
published a few weeks later as follows: 


‘[The paper by London] started by applying canonical transformations to the wave me- 
chanics of discrete eigenvalue problems and ended up with discrete transformation ma- 
trices. A few weeks later, Dirac (1926g) published a paper which began by applying 
canonical transformations to continuous or discrete matrices in Dirac’s matrix mechanics 
of continuous and discrete eigenvalue problems. Dirac’s work thus complemented Lon- 
don’s in two respects: it showed, so to speak, the reversibility of the conceptual process 
under discussion and generalised it to continuous transformations.’ 


Dirac had no doubt about the central importance of the transformation theory of quantum 
mechanics. As he wrote in the preface to the first edition of his classic text The Principles 
of Quantum Mechanics (Dirac, 1930a), 


‘... the growth of the use of transformation theory ... is the essence of the new method 
in theoretical physics.’ 


Dirac’s paper contained a number of innovations which were inspired by his appreciation 
of Lanczos’s rewriting of matrix mechanics in terms of integral equations (Lanczos, 1926) 
(see Sect. 15.2). Dirac appreciated that the elements of the matrices were now continuous 
functions and the matrices were continuous matrices. 


17.3.1 Discrete and continuous matrices 


We can illustrate the similarities and differences between discrete and continuous matrices 
by comparing Fourier series and Fourier integrals (Bohm, 1951). In the case of a Fourier 
series expansion, a wavefunction $\psi(x)$ can be written as a Fourier series 

$$\psi(x) = \sum_n c_n \psi_n(x) , \qquad (17.25)$$

where $\psi_n(x)$ may be taken to be, for example, the complete set of orthonormal harmonic 
functions $\exp(2\pi i n x/L)$, where $0 < n < \infty$. Thus, if the functions $\psi_n(x)$ form a complete 
orthonormal set, any wavefunction can be expressed as the sum (17.25) over all the discrete 
values of n. Now consider a new wavefunction $\phi_m(x)$ obtained by applying the operator A 
to $\psi_m(x)$, 

$$A \psi_m(x) = \phi_m(x) . \qquad (17.26)$$

Since the $\psi_n(x)$ form a complete orthonormal series, $\phi_m(x)$ can be expressed as a sum over 
all the components of the series so that 

$$A \psi_m(x) = \sum_n a_{nm} \psi_n(x) . \qquad (17.27)$$

The values of $a_{nm}$ can be found using the orthonormal properties of the wavefunctions $\psi_n$ 
in the usual way, 

$$a_{nm} = \int \psi_n^*(x) \, A \, \psi_m(x) \, \mathrm{d}x . \qquad (17.28)$$

Now the effect of operating on any function $\psi(x)$ can be found by writing 

$$A \psi(x) = A \sum_m c_m \psi_m(x) = \sum_m \sum_n c_m a_{nm} \psi_n(x) . \qquad (17.29)$$

In the above exposition, the matrix elements $a_{nm}$ are associated with the discrete 
eigenfunctions $\psi_n$. 
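
A short numerical sketch may help to fix ideas about the discrete matrix elements (17.28) and the expansion (17.27). As an illustrative assumption, the basis is taken to be the real orthonormal set $\psi_n(x) = \sqrt{2}\sin(n\pi x)$ on the interval [0, 1] and the operator A is simply multiplication by x; these choices, and the function names, are ours and not Bohm’s or Dirac’s.

```python
import math


def integrate(f, a, b, n=2000):
    """Midpoint-rule quadrature of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def psi(n, x):
    """Assumed orthonormal basis function sqrt(2) sin(n pi x) on [0, 1]."""
    return math.sqrt(2.0) * math.sin(n * math.pi * x)


def matrix_element(n, m):
    """a_nm of (17.28) for A = multiplication by x (the basis is real,
    so complex conjugation has no effect)."""
    return integrate(lambda x: psi(n, x) * x * psi(m, x), 0.0, 1.0)


def apply_operator(m, x, n_max=40):
    """Evaluate (A psi_m)(x) from the sum (17.27), truncated at n_max terms."""
    return sum(matrix_element(n, m) * psi(n, x) for n in range(1, n_max + 1))


x0 = 0.3
print("direct:    ", x0 * psi(1, x0))
print("via matrix:", apply_operator(1, x0))
```

The column of matrix elements $a_{n1}$ reproduces the action of the operator on $\psi_1$ to within the truncation error of the sum.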

In the case of Fourier transforms, the discrete functions $\psi_n(x)$ are replaced by continuous 
functions. Thus, representing the function $\psi$ by a Fourier integral 

$$\psi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \phi(k) \, e^{ikx} \, \mathrm{d}k , \qquad (17.30)$$

the orthonormal functions are now the continuous set of functions $e^{ikx}$ and $\phi(k)$ are the 
corresponding continuous expansion coefficients. The matrix elements $a_{kk'}$ are now found 
by analogy with (17.28): 

$$a_{kk'} = \frac{1}{2\pi} \int e^{-ikx} \, A \, e^{ik'x} \, \mathrm{d}x , \qquad (17.31)$$

with the key distinction that k and k′ are now continuous variables, rather than the discrete 
indices n and m associated with the Fourier components. Thus, for any continuous set of 
functions $\psi_p$, we may define the matrix element $a_{pp'}$ 

$$a_{pp'} = \int \psi_p^* \, A \, \psi_{p'} \, \mathrm{d}x . \qquad (17.32)$$


The following expressions, similar to those found for discrete matrices, follow naturally. 
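Any $\psi(x)$ can be represented by 

$$\psi(x) = \int c_p \psi_p(x) \, \mathrm{d}p , \qquad (17.33)$$

and so if the operator A acts upon $\psi(x)$, 

$$A \psi(x) = \iint c_p \, a_{p'p} \, \psi_{p'}(x) \, \mathrm{d}p' \, \mathrm{d}p . \qquad (17.34)$$

The product rule for continuous matrices becomes 

$$(AB)_{pp'} = \int a_{pp''} \, b_{p''p'} \, \mathrm{d}p'' . \qquad (17.35)$$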


17.3.2 Dirac’s interpretation of quantum mechanics 


Dirac now translated the canonical transformation (17.22), which takes the dynamical variable g 
into G, into the language of continuous matrices (Dirac, 1926f). The transformation 

$$G = b g b^{-1} \qquad (17.36)$$

was written in terms of integrals over continuous matrices as 

$$g(\xi'\xi'') = \iint (\xi'/\alpha') \, \mathrm{d}\alpha' \, g(\alpha'\alpha'') \, \mathrm{d}\alpha'' \, (\alpha''/\xi'') , \qquad (17.37)$$

following the rules described in Sect. 17.3.1. The primed and double primed quantities are 
continuous parameters, c-numbers, which number the rows and columns of the matrix 
elements; $(\xi'/\alpha')$ and $(\alpha''/\xi'')$ represent the transformation functions $b(\xi'/\alpha')$ and $b^{-1}(\alpha''/\xi'')$ 
respectively; $g(\xi'\xi'')$ and $g(\alpha'\alpha'')$ are dynamical variables and are q-numbers. 

One of the objectives of Dirac’s paper was to discover the means of evaluating the 
transformation functions $b(\xi'/\alpha')$ and $b^{-1}(\alpha''/\xi'')$. In the course of his analysis, he introduced 
the famous Dirac δ-function. His training in electrical engineering proved to be invaluable 
since, as he said, 


‘All electrical engineers are familiar with the idea of a pulse, and the δ-function is just a 
way of expressing a pulse mathematically.’ 


The δ-function had been introduced by Kirchhoff and was used extensively by Oliver 
Heaviside in his pioneering studies of electromagnetic theory. The δ-function was defined 
in the usual way as 

$$\delta(x) = 0 \quad \text{for all } x \neq 0 \quad \text{and} \quad \int_{-\infty}^{\infty} \delta(x) \, \mathrm{d}x = 1 . \qquad (17.38)$$


Dirac was well aware of the fact that the δ-function is what Jammer calls a ‘convenient 
mathematical artifice’ and was candid about its mathematical status: 


‘Strictly, of course, $\delta(x)$ is not a proper function of x, but can only be regarded as a limit of 
a certain sequence of functions. All the same one can use $\delta(x)$ as though it were a proper 
function for practically all purposes of quantum mechanics without getting incorrect 
results. One can also use the differential coefficients of $\delta(x)$, namely $\delta'(x)$, $\delta''(x)$, ..., 
which are even more discontinuous and less “proper” than $\delta(x)$ itself.’ 


Dirac went on to show that the n-th derivative of $\delta(x)$ satisfies 

$$\int_{-\infty}^{\infty} f(x) \, \delta^{(n)}(a - x) \, \mathrm{d}x = f^{(n)}(a) . \qquad (17.39)$$

This result was needed to define the elements of the unit continuous diagonal matrix 
$I(\alpha', \alpha'')$ labelled by the continuous parameters $\alpha'$ and $\alpha''$ as 

$$I(\alpha', \alpha'') = \delta(\alpha' - \alpha'') . \qquad (17.40)$$

Therefore, the elements of the generalised continuous diagonal matrix could be written 

$$f(\alpha', \alpha'') = f(\alpha') \, \delta(\alpha' - \alpha'') . \qquad (17.41)$$


These results were needed in the succeeding analysis in his paper. 
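
Dirac’s remark that $\delta(x)$ can be regarded as the limit of a sequence of functions can be illustrated numerically. The sketch below assumes a Gaussian of unit area and small width eps as one member of such a sequence (other sequences would serve equally well) and checks the n = 0 and n = 1 cases of (17.39) for the test function f(x) = sin x; the width, the test function and the point a are arbitrary illustrative choices.

```python
import math


def integrate(f, a, b, n=20000):
    """Midpoint-rule quadrature of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def delta_eps(x, eps):
    """Gaussian of width eps and unit area; tends to delta(x) as eps -> 0."""
    return math.exp(-0.5 * (x / eps) ** 2) / (eps * math.sqrt(2.0 * math.pi))


def delta_eps_prime(x, eps):
    """Derivative of delta_eps with respect to its argument x."""
    return -x / eps ** 2 * delta_eps(x, eps)


a, eps = 0.7, 0.01
# n = 0 case of (17.39): integral of f(x) delta(a - x) dx = f(a)
sifted = integrate(lambda x: math.sin(x) * delta_eps(a - x, eps),
                   a - 1.0, a + 1.0)
# n = 1 case of (17.39): integral of f(x) delta'(a - x) dx = f'(a)
sifted_prime = integrate(lambda x: math.sin(x) * delta_eps_prime(a - x, eps),
                         a - 1.0, a + 1.0)

print(sifted, math.sin(a))
print(sifted_prime, math.cos(a))
```

Reducing eps brings both integrals closer to f(a) and f′(a) respectively, provided the quadrature grid is refined so that it still resolves the narrowing pulse.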

The spectacular result of Dirac’s analysis was his demonstration that the function $(\xi'/\alpha')$ 
was just the appropriate solution of Schrödinger’s equation, with the substitutions $\xi \to q$, 
$\eta \to p$, $(\xi'/\alpha') \to \psi_E(q)$ and $f(\alpha') \to E$. In Dirac’s own words, 


‘The eigenfunctions of Schrödinger’s wave equation are just the transformation functions 
(or the elements of the transformation matrix previously denoted b) that enable one 
to transform from the (q) scheme of matrix representation to a scheme in which the 
Hamiltonian is diagonal.’ 


In fact, Dirac’s analysis was a generalisation of Schrödinger’s wave equation and he 
proceeded to provide a generalisation of Born’s interpretation of the wavefunction. Born’s 
result could be written 

$$\psi(q, t) = \sum_n c_n(t) \, \psi_n(q) , \qquad (17.42)$$

where $|c_n(t)|^2$ was interpreted as the probability that the transition to the state n took place. 
Now, Dirac generalised this result to the continuous energy ranges described by his new 
formalism so that the equivalent of (17.42) became 

$$\psi(q, t) = \int c(E, t) \, \psi_E(q) \, \mathrm{d}E , \qquad (17.43)$$

where $|c(E, t)|^2 \, \mathrm{d}E$ was to be interpreted as the transition probability to a state with energy 
in the range E to E + dE. Reformulating Born’s treatment of the scattering of electrons by 
atoms in his new formalism, he found perfect agreement, provided 


‘the coefficients that enable one to transform from the one set of matrices to the other are 
just those that determine the transition probabilities.’ 


Born’s statistical interpretation was also generalised by Pauli in a footnote to his paper 
on Fermi statistics (Pauli, 1927a). He stated that the probability of finding the position 
coordinates $q_1, q_2, \ldots, q_f$ of a system of N particles in the volume element $\mathrm{d}q_1 \mathrm{d}q_2 \ldots \mathrm{d}q_f$ 
of the configuration space was given by $|\psi(q_1, q_2, \ldots, q_f)|^2 \, \mathrm{d}q_1 \mathrm{d}q_2 \ldots \mathrm{d}q_f$, if the system 
is in a state characterised by $\psi$. As Pauli wrote to Heisenberg, 


‘Born’s interpretation may be viewed as a special case of a more general interpretation. 
Thus, for example, $|\psi(p)|^2 \, \mathrm{d}p$ may be interpreted as the probability that the particle has 
momentum between p and p + dp.’ 


He further asserted that, for every pair of quantum mechanical quantities q and β, a function 
$\phi(q, \beta)$ exists, the ‘probability amplitude’, such that $|\phi(q_0, \beta_0)|^2 \, \mathrm{d}q$ is the probability that 
q lies between the values $q_0$ and $q_0 + \mathrm{d}q$, if β has a fixed value $\beta_0$. These insights were to 
be used by Jordan to expound the first axiomatic approach to the statistical transformation 
theory. 
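
Pauli’s statement about $|\psi(p)|^2 \, \mathrm{d}p$ can be illustrated with a minimal sketch. It assumes a normalised Gaussian wave packet in position space with mean wavenumber K0, units in which $\hbar = 1$, and the usual Fourier relation between the position and momentum amplitudes; all numerical values and names are illustrative assumptions.

```python
import cmath
import math

HBAR = 1.0   # units with hbar = 1 (assumption)
SIGMA = 1.0  # width of the wave packet (illustrative)
K0 = 2.0     # mean wavenumber of the wave packet (illustrative)


def psi(x):
    """Normalised Gaussian wave packet with mean momentum HBAR * K0."""
    norm = (math.pi * SIGMA ** 2) ** -0.25
    return norm * math.exp(-x * x / (2.0 * SIGMA ** 2)) * cmath.exp(1j * K0 * x)


def momentum_amplitude(p, x_max=10.0, n=400):
    """psi(p) = (2 pi hbar)^(-1/2) * integral of psi(x) exp(-i p x / hbar) dx,
    evaluated with the midpoint rule on [-x_max, x_max]."""
    h = 2.0 * x_max / n
    total = 0j
    for k in range(n):
        x = -x_max + (k + 0.5) * h
        total += psi(x) * cmath.exp(-1j * p * x / HBAR)
    return total * h / math.sqrt(2.0 * math.pi * HBAR)


n_p = 240
p_min, p_max = HBAR * K0 - 6.0, HBAR * K0 + 6.0
dp = (p_max - p_min) / n_p
p_values = [p_min + (j + 0.5) * dp for j in range(n_p)]
density = [abs(momentum_amplitude(p)) ** 2 for p in p_values]

total_probability = sum(density) * dp
mean_momentum = sum(p * w for p, w in zip(p_values, density)) * dp
print(total_probability, mean_momentum)
```

The momentum probabilities integrate to unity and are centred on $\hbar K_0$, as expected if $|\psi(p)|^2$ is read as a probability density in momentum.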


17.3.3 Jordan's axiomatic synthesis of the statistical transformation theory 


Inspired by Pauli’s concept of probability amplitudes, Jordan developed an axiomatic 
approach to the transformation theory which was independent of London’s and Dirac’s 
formulations (Jordan, 1927). According to Jammer, the three major axioms of his approach 
were: 


(1) ‘[the probability amplitude] $\phi(q, \beta)$ is independent of the mechanical nature (Hamiltonian 
function) of the system and depends only on the kinematic relation between q 
and β. 

(2) the probability (density) that for a fixed value $\beta_0$ of β the quantum-mechanical quantity 
q has the value $q_0$ is the same as the probability (density) that, for a fixed $q_0$ of q, β 
has the value $\beta_0$. 

(3) probabilities are combined by superposition, that is, if $\phi(x, y)$ is the probability 
amplitude for the value x of q at a fixed value y of β, and $\chi(x, y)$ is the probability 
amplitude for the value x of Q at the fixed value y of q, then the probability amplitude 
for x of Q at a fixed value y of β is given by 

$$\Phi(x, y) = \int \chi(x, z) \, \phi(z, y) \, \mathrm{d}z . \qquad (17.44)$$

In the particular case Q = β, Jordan’s $\Phi(x, y)$ becomes Dirac’s $\delta(x - y)$.’ 


In Jordan’s axiomatic approach, p is defined as the momentum canonically conjugate to q 
if the probability amplitude $\rho(x, y)$, for every possible value x of p at a fixed y of q, is 
given by 

$$\rho(x, y) = \exp\left(-\frac{2\pi i \, x y}{h}\right) . \qquad (17.45)$$

Jordan inferred that for a fixed value of q, all possible values of p are equally probable and 
vice versa. 

Jordan’s somewhat formal and mathematical approach was not readily accessible, except 
to experts in transformational theory, but it had some remarkable features. It will be noted 
that (17.45) is a solution of the time-independent Schrödinger wave equation for a particle 
moving with constant momentum. Hence, if the momentum is precisely known, all values 
of q are equally likely, a precursor of the Heisenberg uncertainty principle. Jordan’s great 
achievement was that, by adopting the full apparatus of Hermitian operators, he was able 
to demonstrate that his theory encompassed not only Schrödinger’s wave equation and 
Heisenberg’s matrix mechanics, but also the Born–Wiener operator calculus and Dirac’s 
q-number calculus. The synthesis of all these approaches was to find its ultimate expression 
in the application of functional analysis to the formalism of quantum mechanics. 


17.3.4 The new perspective 


It is worthwhile reviewing what had been achieved as a result of Born’s probabilistic 
interpretation of the wavefunction and the Dirac—Jordan statistical transformation theory, 
following Jammer’s careful exposition. The achievements of Schrödinger and Heisenberg 
may be summarised as follows. Schrödinger’s discovery of his wave equation was the 
route for determining the energy eigenvalues of quantum mechanical systems. Similarly, 
in Heisenberg’s matrix mechanics, the solutions for p and q were found for the diagonal 
matrices for which the diagonal terms were the energy eigenvalues. The off-diagonal 
terms were interpreted as transition probabilities. The observables were the energies of 
the stationary states and the transition probabilities. But matrix mechanics could not deal 
with the motion of a free electron, nor was the position variable accorded any status in the 
scheme, being disguised through the process of taking Fourier transforms. 
In contrast, the transformation theory, to quote Jammer (1989), 


‘introduced the experimentally required generalisations by postulating that, in principle, 
any Hermitian matrix A represents an observable quantity a, on a par with the energy, and 
that the eigenvalues of A are possible results of measuring a. Through Dirac’s introduction 
of continuous matrices and Born’s probabilistic interpretation of Schrödinger’s wave 
equation, the notion of position was retrieved.’ 
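
The reason Hermitian matrices are suitable representatives of observables is that their eigenvalues, the possible results of measurement, are always real. A minimal sketch for the two-by-two case follows; the matrices are arbitrary illustrative examples and the helper `eigenvalues_2x2` is a hypothetical name introduced here.

```python
import cmath


def eigenvalues_2x2(a):
    """Eigenvalues of a 2x2 complex matrix from its characteristic polynomial
    lambda^2 - (trace) lambda + (determinant) = 0."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


# Hermitian: equal to its own conjugate transpose
hermitian = [[1.0, 2 - 1j],
             [2 + 1j, -1.0]]
# Not Hermitian: the off-diagonal elements are not complex conjugates
non_hermitian = [[1.0, 2 - 1j],
                 [2 - 1j, -1.0]]

print("Hermitian:    ", eigenvalues_2x2(hermitian))
print("non-Hermitian:", eigenvalues_2x2(non_hermitian))
```

For the Hermitian example the imaginary parts of the eigenvalues vanish, whereas the non-Hermitian example has complex eigenvalues which could not be interpreted as the outcomes of measurements.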


17.4 The mathematical completion of quantum mechanics 


In 1926, Heisenberg appealed to the mathematicians to take up the challenge of coming 
up with the mathematics which would underpin the different approaches to quantum 
mechanics. He was fortunate that the leader of this attack was David Hilbert who was 
already well versed in the mathematical issues of quantum mechanics and was a close 
neighbour of Born’s at Göttingen. In late 1926, Hilbert began a systematic study of the 
mathematical foundations of quantum mechanics supported by his assistants Lothar Nordheim 
and John von Neumann. During the winter of 1926-1927, Hilbert gave a two-hour lecture 
every Monday and Thursday morning on the mathematics underlying quantum mechanics 
and a summary of these was published in 1927 (Hilbert et al., 1927). 

It would take us too far into pure mathematics to describe exactly what Hilbert and his 
colleagues achieved, but suffice to say that, as might be expected of the mathematician 
who had axiomatised the fundamentals of geometry, he set about coming up with a self- 
consistent and mathematically rigorous axiomatic basis for the application of probability 
amplitudes in quantum mechanics. They established six axioms which the amplitudes had 
to fulfil. Hilbert associated an operator with every dynamical variable and the resulting 
operator calculus was to be the mathematics of the probability amplitudes associated with 
each operator. Hilbert recognised, however, that strict mathematical rigour was not able 
alone to encompass the needs of quantum physics. As he stated, 


‘It is difficult to understand such a theory if the formalism and its physical interpretation 
are not strictly kept apart. Such a separation shall be adhered to even though at the 
present stage of the development of the theory no complete axiomatisation has as yet 
been achieved. However, what is definite by now is the analytical apparatus which will not 
admit any alternations in its purely mathematical aspects. What can, and probably will, 
be modified is its physical interpretation for it allows a certain freedom of choice.’ 


Hilbert adopted the integral equation approach to the operator formalism which he had 
already pioneered in his paper of 1912 (Hilbert, 1912). Summarising the results of their 
analysis, Hilbert and his colleagues found that the condition that the relative probability 
density is real and non-negative is that the operators had to be Hermitian. Furthermore, 
by introducing Dirac’s δ-function, they were able to derive both the time-independent and 
time-dependent Schrödinger wave equations for both the energy and position coordinates. 
Their transformation theory was fully consistent with Born’s probabilistic interpretation 
of the wavefunction, that is, that the function $|\psi_n(x)|^2$ is the probability that the atom is 
found in the nth state and that it is located at x while in that state. The Hilbert–Neumann–Nordheim 
reformulation of the transformation theory of Dirac and Jordan included both 
wave and matrix mechanics and laid the rigorous mathematical foundations for quantum 
mechanics. 

There remained, however, the awkward properties of Dirac’s δ-function which appeared 
in the reduction of the formal operator description in terms of integral equations into 
Schrödinger’s wave equation. The legitimising of Dirac’s δ-function was only to take place 
very much later. Von Neumann therefore adopted a different approach involving concepts 
developed by Hilbert on linear equations and used these to provide a new mathematical 
framework for quantum mechanics (von Neumann, 1927). This provided the most suitable 
formalism for the elaboration of quantum mechanics and its subsequent extensions into 
relativistic quantum mechanics and quantum field theory. This involved the introduction of 
what von Neumann called Hilbert space, an infinite-dimensional complete separable linear 




space with a positive definite metric. Within this space, he developed a theory of linear 
operators. In one realisation, the operators become functionals as studied in functional 
analysis. Likewise, adjoint and Hermitian operators appeared naturally in the theory and 
the most general description of Born’s statistical and probabilistic interpretation of quantum 
mechanics was described. 

The formal elaboration of the technical details of von Neumann’s scheme was a 
formidable achievement and, although the overall scheme was laid out in his paper of 
1927, it was not until 1929 that he had solved all the formal issues involved (von Neumann, 
1929). The modern axiomatic exposition of quantum mechanics is ultimately founded upon 
von Neumann’s ground-breaking papers. These cannot be appreciated without consider- 
able effort and this qualitative summary does scant justice to his remarkable achievement. 
Jammer provides more of the mathematical details, but even he has to refer the interested 
reader to the original papers for a full appreciation of their mathematical content. 


17.5 Heisenberg's uncertainty principle 


While the tools of quantum mechanics were approaching completion, the interpretation 
of the quantum mechanical operators and variables was still unclear. Heisenberg’s initial 
reaction had been to reject the concept of position and velocity inside atoms as having 
no meaning since they could not be observed; for Heisenberg, the only observables at 
the atomic level were the emission and absorption properties of atoms, namely the frequencies, 
intensities and polarisations of their radiation. Certainly, the concepts of position, velocity 
and momentum could not have their classical significance at the atomic level in view 
of the fundamental quantum non-commutability relation $pq - qp = h/2\pi i$. And yet, the 
formalism of quantum mechanics had its roots in classical physics which certainly worked 
splendidly for macroscopic bodies. Born’s probabilistic interpretation of the wavefunction 
offered a compelling way forward, indicating that the outcome of experiments at the atomic 
level have an intrinsic indeterminacy. This feature of quantum mechanics was certainly 
appreciated by Dirac (1926g) who remarked that 


‘One cannot answer any question on the quantum theory, which refers to numerical values 
for both the p and the q. One would expect, however, to be able to answer questions in 
which only the q or only the p are given numerical values...’ 


Jordan (1927) likewise stated that ‘for a given value of q all values of p are equally possible’. 
Heisenberg and his colleagues fully appreciated that, according to quantum mechanics, it 
was 


‘... meaningless to speak of the place of a particle with a definite velocity ... But if 
one does not take it too seriously with the accuracy in using the notions of velocity and 
position, then it may well make sense.’ (Heisenberg, 1960) 


In a letter to Pauli of 28 October 1926, Heisenberg asserted that it made no sense to 
talk about a monochromatic wave at a definite instant or extremely short period of time. 
After four months of deliberation on these issues, he came up with a self-consistent 
solution to reconcile the classical and quantum interpretations of non-commuting variables, 
in particular, in defining the range of applicability of the classical concepts of position 
and momentum. These are embodied in what became known as Heisenberg’s uncertainty 
principle. The concepts were contained in a 14-page letter to Pauli who reacted positively 
and enthusiastically to its contents. The contents of that letter constituted the bulk of 
Heisenberg’s famous paper on the uncertainty principle which was submitted to the 
Zeitschrift für Physik at the end of March 1927 (Heisenberg, 1927). 

Heisenberg used the newly developed statistical transformation theory of Dirac and 
Jordan to define precisely the theoretically permissible values of non-commuting variables 
such as q and p in terms of the statistical distributions of their p and q values. As discussed 
in Sect. 17.3, this theory had the advantage of being able to deal with matrix elements 
which were continuous functions. Heisenberg used results derived by Jordan in his version 
of the Dirac-Jordan transformational theory (Jordan, 1927). Jordan took the probability 
amplitude for q to be of the following form 

$$S(\eta, q) \propto \exp\left[-\frac{(q - q')^2}{2q_1^2} - \frac{2\pi i\, p'(q - q')}{h}\right] . \tag{17.46}$$

This expression has the following meaning. η is some fixed parameter which will not enter 
into the argument. S(η, q) is the probability amplitude that the electron will be at position 
q if the mean value of the position is q′ with uncertainty q₁. The probability is found by 
taking the square of the modulus of the probability amplitude and so 

$$|S(\eta, q)|^2 = SS^* \propto \exp\left[-\frac{(q - q')^2}{q_1^2}\right] . \tag{17.47}$$


This formulation has the advantage that it results in a Gaussian probability distribution of 
possible values of q with ‘uncertainty’ q1. 

Next, Heisenberg used the rules of the transformation theory to write down the corre- 
sponding probability amplitude for p by using the relation 


$$S(\eta, p) = \int S(\eta, q)\, S(q, p)\, \mathrm{d}q . \tag{17.48}$$


The function S(q, p) is given by (17.45) and so, carrying out the integration, Heisenberg 
found that the probability amplitude for p is 





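$$S(\eta, p) \propto \exp\left[-\frac{(p - p')^2}{2p_1^2} + \frac{2\pi i\, q'(p - p')}{h}\right] , \tag{17.49}$$

and the corresponding probability distribution for p 

$$|S(\eta, p)|^2 = SS^* \propto \exp\left[-\frac{(p - p')^2}{p_1^2}\right] , \tag{17.50}$$

where 

$$p_1 q_1 = \frac{h}{2\pi} . \tag{17.51}$$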


Downloaded from Cambridge Books Online by IP 132.166.47.205 on Thu Jan 23 07:55:51 GMT 2014. 
http://dx.doi.org/10.1017/CB09781139062060.018 
Cambridge Books Online © Cambridge University Press, 2014 







This is Heisenberg’s uncertainty principle describing the intrinsic indeterminacy with which 
p and q can be determined and provides a statistical interpretation of the basic non- 
commutability relation pq − qp = h/2πi. As Heisenberg wrote, 


‘The more accurately the position is determined, the less accurately the momentum is 
known and conversely.’ 


In fact, the essence of Heisenberg’s calculation is most easily appreciated from the properties 
of the integral Fourier transforms of a Gaussian distribution, as was demonstrated by 
Darwin (1927d); a simple analysis using that approach is presented in the endnote to 
this chapter. It is striking that the Fourier transform approach indicates clearly the origin 
of the complex terms in the probability amplitudes (17.46) and (17.49) in Jordan’s and 
Heisenberg’s analyses. 

Let us convert (17.51) into conventional notation by rewriting the Gaussian distributions 
(17.47) and (17.50) in proper normalised form. In order that the distributions be standard 
Gaussians, their standard deviations Δp and Δq are related to p₁ and q₁ by Δp = p₁/√2 
and Δq = q₁/√2. Hence, (17.51) may be written 

$$\Delta p\, \Delta q = \frac{h}{4\pi} . \tag{17.52}$$


Ditchburn (1930) demonstrated that in fact, because of the choice of Gaussian distributions 
for p and q, the uncertainty relation (17.52) represents the minimum indeterminacy of p 
and q, the inequality applying for all non-Gaussian distributions, 

$$\Delta p\, \Delta q \geq \frac{h}{4\pi} . \tag{17.53}$$

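The step from (17.46) to (17.52) can be checked numerically. The following is a minimal sketch, not Heisenberg's own procedure: it samples the Gaussian amplitude (17.46) on a grid, uses a discrete Fourier transform as a stand-in for the integral (17.48), and compares the product of the standard deviations with h/4π. The units (h = 1), the grid parameters and the function name `gaussian_uncertainty` are assumptions made purely for illustration.

```python
import numpy as np

H = 1.0  # Planck's constant in arbitrary units (assumed for illustration)


def gaussian_uncertainty(q1=1.0, q_mean=0.5, p_mean=2.0, n=4096, span=60.0):
    """Return (delta_q, delta_p) for a Gaussian amplitude of the form (17.46)."""
    q = np.linspace(-span / 2, span / 2, n, endpoint=False)
    dq = q[1] - q[0]

    # Probability amplitude in q, following the form of (17.46).
    amp_q = np.exp(-(q - q_mean) ** 2 / (2 * q1 ** 2)
                   - 2j * np.pi * p_mean * (q - q_mean) / H)

    # Transform to p with the kernel exp(+2*pi*i*p*q/h); numpy's inverse FFT
    # uses this sign. Constant factors and phases drop out of |amplitude|^2.
    amp_p = np.fft.ifft(amp_q)
    p = H * np.fft.fftfreq(n, d=dq)

    def std(values, amplitude):
        # Standard deviation of `values` weighted by |amplitude|^2.
        prob = np.abs(amplitude) ** 2
        prob /= prob.sum()
        mean = np.sum(prob * values)
        return np.sqrt(np.sum(prob * (values - mean) ** 2))

    return std(q, amp_q), std(p, amp_p)


delta_q, delta_p = gaussian_uncertainty()
product = delta_q * delta_p
print(product, H / (4 * np.pi))
```

With q₁ = 1 the sketch should return Δq close to 1/√2, and the product ΔpΔq should agree with h/4π to within the discretisation error, whatever mean values q′ and p′ are chosen.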
Heisenberg’s great paper on the uncertainty principle was of central importance for many 
different aspects of the understanding of quantum physics and for physics in general. Let 
us highlight just a few of these consequences. 


1. At the most elementary level, the principle tells us the scales on which the classical and 
quantum theories are applicable. For example, applying the principle to an electron in 
the Bohr model of the atom, its velocity in the ground state, n = 1, is 2.2 × 10⁶ m s⁻¹ 
and hence its momentum is p = mₑv = 2 × 10⁻²⁴ kg m s⁻¹. Therefore, taking Δp = p 
and setting Δp Δx = h/4π, we find Δx = h/4πΔp = 0.3 × 10⁻¹⁰ m, roughly the size 
of the first Bohr orbit, which is no accident. What this calculation is telling us is that, on 
the scale of atoms, we cannot know precisely where the electron is at any moment; we 
can only describe very precisely where it is likely to be. 

2. More generally, the principle tells us that, at any time, we cannot define the state of any 
system at the microscopic level absolutely precisely. Thus, the concept of setting up a 
system with a perfectly defined set of initial conditions and then following precisely the 
future evolution of the system is not feasible. 

3. Another way of expressing this same concern is that causality at the microscopic level 
becomes meaningless since the intrinsic uncertainty means that we cannot predict exactly 
the outcome of any process. We can make accurate predictions about the various possible 
outcomes of the experiment, but we cannot state with absolute certainty which will 
actually occur. 


4. The principle has profound implications for the theory of measurement and this topic 
became a major preoccupation in the theory of quantum processes. 

5. The principle had a major impact upon philosophy, in many ways demolishing many 
of the basic tenets of classical philosophy and logic. For example, the way in which 
probability amplitudes and probabilities are evaluated in quantum mechanics tells us that 
events which might have happened, but didn’t, can affect the outcome of an experiment. 
There is no scope for such constructs in classical physics, but they are an integral part 
of the Universe we live in. 

6. Perhaps most remarkable of all is the fact that statistical concepts were not explicitly 
incorporated into the basic postulates and structure of the theory, and yet the predictions 
are all of probability amplitudes and probabilities. Heisenberg should have the last word 
on this subject 


‘We have not assumed that the quantum theory, unlike classical physics, is essentially 
a statistical theory in the sense that from exact data only statistical data can be inferred. 
For such an assumption is refuted, for example, by the well-known experiments of Bothe 
and Geiger. However, in the strong formulation of the causal law, “If we know exactly 
the present, we can predict the future”, it is not the conclusion but rather the premise 
which is false. We cannot know, as a matter of principle, the present in all its details.’ 


It is not perhaps surprising that it took some time before the full implications of Heisenberg’s 
calculations were fully appreciated since they ran contrary to many of the most cherished 
tenets of classical physics. Perhaps the most important conclusion of these calculations was 
the clear distinction between the realms of applicability of classical and quantum physics. 
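
The order-of-magnitude estimate for the Bohr atom given in the first of the consequences listed above can be reproduced in a few lines. This is only a sketch of the arithmetic: the velocity is the value quoted in the text, while the rounded values of the physical constants and of the Bohr radius are supplied here as assumptions.

```python
import math

H = 6.626e-34      # Planck's constant in J s (rounded, assumed)
M_E = 9.109e-31    # electron mass in kg (rounded, assumed)
V_GROUND = 2.2e6   # speed in the n = 1 Bohr orbit in m/s, as quoted in the text
A_0 = 5.29e-11     # radius of the first Bohr orbit in m (rounded, assumed)

# Take the momentum uncertainty to be the momentum itself, as in the text.
momentum = M_E * V_GROUND
delta_x = H / (4 * math.pi * momentum)

print(f"p = {momentum:.2e} kg m/s")
print(f"delta x = {delta_x:.2e} m, or {delta_x / A_0:.2f} Bohr radii")
```

The position uncertainty comes out at about half the Bohr radius, which is the sense in which the agreement with the size of the orbit is no accident.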


17.6 Ehrenfest’s theorem 


An important link between the classical and quantum pictures was provided by Ehrenfest 
who carried out a ‘short elementary calculation without approximations’ (Ehrenfest, 1927). 
In modern language, he showed that, according to quantum mechanics, the expectation 
value of the time derivative of the momentum is equal to the expectation value of the 
negative gradient of the potential function, the quantum equivalent of Newton’s second law 
of motion. In only a page and a half, Ehrenfest quoted the result of his calculations, without 
giving a mathematical derivation, which he regarded as ‘elementary’. The argument goes 
as follows. 

For simplicity, consider the one-dimensional, time-dependent Schrödinger wave equation 
and its complex conjugate, as discussed in relation to (14.102): 


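$$-\frac{h^2}{8\pi^2 m_e}\frac{\partial^2 \psi}{\partial x^2} + V(x)\,\psi = \frac{ih}{2\pi}\frac{\partial \psi}{\partial t} , \tag{17.54}$$

$$-\frac{h^2}{8\pi^2 m_e}\frac{\partial^2 \psi^*}{\partial x^2} + V(x)\,\psi^* = -\frac{ih}{2\pi}\frac{\partial \psi^*}{\partial t} . \tag{17.55}$$

Following Schrödinger’s prescription, we define the mean values of the position and 
momentum of the particle, ⟨x⟩ and ⟨p⟩, what are now referred to as their expectation values, 
by the relations 

$$\langle x \rangle = \int_{-\infty}^{\infty} \psi^* x\, \psi \,\mathrm{d}x , \tag{17.56}$$

$$\langle p \rangle = \int_{-\infty}^{\infty} \psi^* \hat{p}\, \psi \,\mathrm{d}x = -\frac{ih}{2\pi}\int_{-\infty}^{\infty} \psi^* \frac{\partial \psi}{\partial x}\,\mathrm{d}x , \tag{17.57}$$

where x is the scalar position operator and p̂ the momentum operator −(ih/2π) ∂/∂x. We 
now find the derivative of ⟨x⟩ with respect to time. 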


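$$\frac{\mathrm{d}\langle x \rangle}{\mathrm{d}t} = \frac{\mathrm{d}}{\mathrm{d}t}\int_{-\infty}^{\infty} \psi^* x\, \psi \,\mathrm{d}x = \int_{-\infty}^{\infty} \left( \frac{\partial \psi^*}{\partial t}\, x\, \psi + \psi^* x\, \frac{\partial \psi}{\partial t} \right) \mathrm{d}x . \tag{17.58}$$

Using the expressions (17.54) and (17.55) for ∂ψ/∂t and ∂ψ*/∂t respectively, we find 

$$\frac{\mathrm{d}}{\mathrm{d}t}\int_{-\infty}^{\infty} \psi^* x\, \psi \,\mathrm{d}x = \frac{ih}{4\pi m_e}\int_{-\infty}^{\infty} \left( \psi^* x\, \frac{\partial^2 \psi}{\partial x^2} - \frac{\partial^2 \psi^*}{\partial x^2}\, x\, \psi \right) \mathrm{d}x . \tag{17.59}$$
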
Now, 


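$$\frac{\partial^2 (x\psi)}{\partial x^2} = 2\frac{\partial \psi}{\partial x} + x\, \frac{\partial^2 \psi}{\partial x^2} , \tag{17.60}$$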


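and so, substituting for x ∂²ψ/∂x² in (17.59), 

$$\frac{\mathrm{d}}{\mathrm{d}t}\int_{-\infty}^{\infty} \psi^* x\, \psi \,\mathrm{d}x = \frac{ih}{4\pi m_e}\int_{-\infty}^{\infty} \left[ \psi^* \frac{\partial^2 (x\psi)}{\partial x^2} - \frac{\partial^2 \psi^*}{\partial x^2}\, x\psi \right] \mathrm{d}x - \frac{ih}{2\pi m_e}\int_{-\infty}^{\infty} \psi^* \frac{\partial \psi}{\partial x}\,\mathrm{d}x \tag{17.61}$$

$$= \frac{ih}{4\pi m_e}\int_{-\infty}^{\infty} \frac{\partial}{\partial x}\left[ \psi^* \frac{\partial (x\psi)}{\partial x} - \frac{\partial \psi^*}{\partial x}\, x\psi \right] \mathrm{d}x + \frac{1}{m_e}\int_{-\infty}^{\infty} \psi^* \left( -\frac{ih}{2\pi}\frac{\partial}{\partial x} \right) \psi \,\mathrm{d}x . \tag{17.62}$$

The first integral on the right-hand side of (17.62) becomes 

$$\frac{ih}{4\pi m_e}\left[ \psi^* \frac{\partial (x\psi)}{\partial x} - \frac{\partial \psi^*}{\partial x}\, x\psi \right]_{-\infty}^{+\infty} \tag{17.63}$$

and this must be zero since the wavefunctions ψ and ψ* are zero at ±∞. The term 
−(ih/2π) ∂/∂x in the last integral of (17.62) is the momentum operator and so from 
(17.56) and (17.57) 

$$m_e \frac{\mathrm{d}\langle x \rangle}{\mathrm{d}t} = \langle p \rangle . \tag{17.64}$$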


Thus, quantum mechanically, the mean, or expectation, value of the momentum is equal to 
the mass of the electron times the mean, or expectation, value of the velocity. 
This is the exact equivalent of the definition of momentum in classical mechanics. Notice that 
Planck’s constant has disappeared from this expression, despite the fact that the momentum 
is defined purely quantum mechanically. 

Let us now take the second time derivative of the mean value of x. We 
follow exactly the same procedure as above. From (17.57), we find, 

$$m_e \frac{\mathrm{d}^2 \langle x \rangle}{\mathrm{d}t^2} = -\frac{ih}{2\pi}\int_{-\infty}^{\infty} \frac{\partial}{\partial t}\left[ \psi^* \frac{\partial \psi}{\partial x} \right] \mathrm{d}x . \tag{17.65}$$


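Carrying out the partial differentiation and then substituting for the terms in ∂/∂t using the 
pair of Schrödinger equations (17.54) and (17.55), we find 

$$m_e \frac{\mathrm{d}^2 \langle x \rangle}{\mathrm{d}t^2} = -\frac{h^2}{8\pi^2 m_e}\int_{-\infty}^{\infty} \left[ \frac{\partial^2 \psi^*}{\partial x^2}\frac{\partial \psi}{\partial x} - \psi^* \frac{\partial}{\partial x}\left( \frac{\partial^2 \psi}{\partial x^2} \right) \right] \mathrm{d}x - \int_{-\infty}^{\infty} \psi^* \frac{\partial V(x)}{\partial x}\, \psi \,\mathrm{d}x . \tag{17.66}$$

Using the fact that ψ and ψ* tend to zero at ±∞, after a bit of manipulation of the partial 
derivatives, the first integral on the right-hand side of (17.66) is zero. The second term is 
just the expectation value of the negative gradient of the potential V(x), which is the 
expectation value of the force ⟨f⟩. Hence, 

$$m_e \frac{\mathrm{d}^2 \langle x \rangle}{\mathrm{d}t^2} = \frac{\mathrm{d}\langle p \rangle}{\mathrm{d}t} = -\int_{-\infty}^{\infty} \psi^* \left( \frac{\partial V}{\partial x} \right) \psi \,\mathrm{d}x = \langle f \rangle . \tag{17.67}$$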














This is Ehrenfest's theorem which states that the rate of change of the expectation value of 
the momentum is equal to the expectation value of the applied force. This is exactly the 
same statement as Newton’s second law of motion, but derived purely from the rules 
of quantum mechanics. Notice again that Planck’s constant has disappeared from the 
expression. Ehrenfest’s theorem provided physicists with a natural continuity between the 
quantum description of the action of forces and the world of classical physics. 
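
The two results (17.64) and (17.67) can be verified numerically at a single instant without solving for the time evolution. The sketch below is an illustration under stated assumptions, not Ehrenfest's calculation: it takes a Gaussian wave packet in an arbitrarily chosen anharmonic potential, evaluates ∂ψ/∂t from the Schrödinger equation (17.54), and compares both sides of each relation. The units (h/2π = 1 and unit mass), the packet parameters, the potential and the function name `ehrenfest_terms` are all hypothetical choices.

```python
import numpy as np

HBAR = 1.0  # stands for h / 2 pi, arbitrary units (assumed)
MASS = 1.0  # particle mass, arbitrary units (assumed)


def ehrenfest_terms(n=2048, length=40.0, x0=1.5, p0=0.7, sigma=1.0):
    """Return (m d<x>/dt, <p>, d<p>/dt, <f>) for a Gaussian wave packet."""
    x = np.linspace(-length / 2, length / 2, n, endpoint=False)
    dx = x[1] - x[0]
    k = 2 * np.pi * np.fft.fftfreq(n, d=dx)

    def ddx(f):
        # Spectral derivative; adequate because psi is negligible at the edges.
        return np.fft.ifft(1j * k * np.fft.fft(f))

    def integrate(f):
        return np.sum(f) * dx

    # Normalised Gaussian packet centred on x0 with mean momentum p0.
    psi = np.exp(-(x - x0) ** 2 / (4 * sigma ** 2) + 1j * p0 * x / HBAR)
    psi = psi / np.sqrt(integrate(np.abs(psi) ** 2))

    # Hypothetical anharmonic potential and its analytic gradient.
    potential = 0.5 * x ** 2 + 0.1 * x ** 4
    dv_dx = x + 0.4 * x ** 3

    # d(psi)/dt from the Schrodinger equation (17.54).
    h_psi = -(HBAR ** 2 / (2 * MASS)) * ddx(ddx(psi)) + potential * psi
    dpsi_dt = h_psi / (1j * HBAR)

    p_psi = -1j * HBAR * ddx(psi)
    mean_p = integrate(np.conj(psi) * p_psi).real

    # Time derivatives of <x> and <p> by differentiating under the integral.
    dx_dt = 2 * integrate(np.conj(psi) * x * dpsi_dt).real
    dp_dt = (integrate(np.conj(dpsi_dt) * p_psi)
             + integrate(np.conj(psi) * (-1j * HBAR) * ddx(dpsi_dt))).real

    mean_force = -integrate(np.abs(psi) ** 2 * dv_dx)
    return MASS * dx_dt, mean_p, dp_dt, mean_force


m_dx_dt, mean_p, dp_dt, mean_force = ehrenfest_terms()
print(m_dx_dt, mean_p)    # the two sides of (17.64)
print(dp_dt, mean_force)  # the two sides of (17.67)
```

Note that the force term is the mean of −∂V/∂x over the packet, which for an anharmonic potential differs from −∂V/∂x evaluated at the mean position ⟨x⟩.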

Ehrenfest’s brief paper, in which only the results of his calculations were presented, was 
important in furthering the cause of quantum mechanics among practising physicists. The 
fact that the equivalent of Newton’s second law of motion could be derived from a purely 
quantum mechanical set of operations made the theory much more acceptable to physicists, 
despite the fact that the classical and quantum pictures are built on completely different 
foundations. 


17.7 The Copenhagen interpretation of quantum mechanics 


By 1927, most of the elements of non-relativistic quantum mechanics were in place, but 
their interpretation was the subject of hot debate among the principal contributors to the new 
discipline. At the centre of these debates was Bohr, who continued to ponder deeply about 
the meaning of the new vistas emerging from the brilliant analyses of Heisenberg, Jordan, 
Born, Schrödinger, Pauli, Wiener, von Neumann and many others. The invitation to visit 
Bohr in Copenhagen was a singular honour but also a test of the character of the individual 
to match Bohr’s inexhaustibility in maintaining an intellectual argument over many hours 
or days. Schrödinger was invited by Bohr to visit Copenhagen in September 1926 to discuss 
his brilliant papers on wave mechanics. The discussion would often last whole days, leaving 
Schrödinger in a state of exhaustion. The debates concerned the rationalisation of Bohr’s 
insistence upon the concept of ‘quantum jumps’ and Schrödinger’s conviction that these 
were unphysical and should be replaced by his continuous wavefunctions. At one point, in 
exasperation, Schrödinger is reported to have stated, 


‘If one has to stick to this damned quantum jumping, then I regret ever having become 
involved in this thing.’ 


In conciliatory mood, Bohr responded 


‘But we others are very grateful to you that you were, since your work did so much to 
promote this theory.’ 


There is a large literature on the vicissitudes associated with the interpretation of quantum 
mechanics, many of the key protagonists holding different views. Einstein was particularly 
aggressive in his opposition to the probability interpretation of the wavefunction, his con- 
viction of the incompleteness of the new quantum mechanics being expressed by his 
well-known remark in a letter to Born of 4 December 1926, 


‘I, at any rate, am convinced that He (God) does not throw dice.’ 


17.7.1 Bohr and complementarity 


Bohr remained the godfather of the new discipline of quantum mechanics and agonised 
over the interpretation of the new scheme of things. He had held out against the reality of the 
wave-particle duality, particularly Einstein’s concept of light quanta, but eventually he con- 
ceded, following the decisive results of the Bothe-Geiger experiments and the consequent 
rejection of the Bohr-Kramers-Slater picture (see Sect. 10.2). For Bohr, the wave-particle 
duality for radiation was at the heart of the new conceptions of quantum mechanics — how 
was it possible for radiation to possess simultaneously both wave properties, as exhibited 
by the phenomena of interference and diffraction, and the particle properties found in the 
photoelectric and Compton effects? To resolve this apparent paradox, he introduced the 
concept of complementarity, a conception which was designed as what Jammer calls a ‘new 
logical instrument’ for the interpretation of classical and quantum phenomena. The words 
of Jammer give as close to a definition of complementarity as will be found in the literature: 


‘[Bohr] called it “complementarity”, denoting thereby the logical relation between two 
descriptions or sets of concepts which, though mutually exclusive, are nevertheless both 
necessary for an exhaustive description of the situation. In Heisenberg’s reciprocal un- 
certainty relations he saw a mathematical expression which defines the extent to which 
complementary notions may overlap, that is, may be applied simultaneously, but, of 
course, not rigorously. The uncertainty relations, Bohr contended, tell us the price we 
have to pay for violating the rigorous exclusion of notions, the price for applying to the 
description of a physical phenomenon two categories of notions which, strictly speaking, 
are contradictory to each other.’ (Jammer, 1989) 


Bohr went on to relate the notion of complementarity to the issue of measurement in 
quantum mechanics. In his interpretation, it was now impossible to separate out what was 
being observed from the means by which it was observed. He contended that measuring 
instruments produce results which are expressed in classical terms and that by observing the 
system in different ways, complementary variables may be determined which however can 
only be determined and reconciled according to the limitations imposed by Heisenberg’s 
uncertainty principle. 

After a number of years in which he had published little on quantum mechanics, Bohr 
first described his new understanding at the 1927 International Congress of Physics held in 
Como to celebrate the centenary of the death of Alessandro Volta, who had been born and 
died there (Bohr, 1928). In his address The quantum postulate and the recent development of 
quantum theory, Bohr set out his concepts for the first time and these might be considered 
a primitive form of what was later to be known as the Copenhagen interpretation of 
quantum mechanics. Bohr’s arguments did not make a strong immediate impression upon 
his audience, but the emphasis upon the role of experiment in defining what was measurable 
at the elementary level was of lasting importance. 

Bohr gave no exact definition of the principle of complementarity in his various expositions 
of the concept, and indeed later used the flexibility of its definition to include areas outside 
physics, as described in Pais’s biography of Bohr (Pais, 1991). Pauli sharpened up the 
concept to give a precise operational meaning to the principle (Pauli, 1933). He stated 


‘[two classical concepts — and not two modes of description — are] complementary if the 
applicability of one (for example, position coordinate) stands in the relation of exclusion 
to that of the other (for example, momentum).’ 


There ensued a debate among theorists, such as von Weizsäcker and Feyerabend, about the 
exact meaning of Bohr’s complementarity, largely engendered by the vagueness of Bohr’s 
definition of the conception. For some theorists its very vagueness was a virtue in allowing 
flexible interpretations of the relation of non-commuting variables to measurement and 
observation. 


17.7.2 Dirac notation 


The first complete exposition of the formal foundations of quantum mechanics was given 
by Dirac in his classic text The Principles of Quantum Mechanics, the first edition of which 
was published in 1930 (Dirac, 1930a). Dirac’s book was an extraordinary achievement 
setting out in detail the mathematical foundations of quantum mechanics axiomatically. 
The remarkable feature of his exposition is the constant interplay between the formal 
structure of the statistical transformation theory and the need to account for observed 
physical phenomena. In the first edition, Dirac used the same notation he had adopted in 
his papers from 1925 to 1930. This exposition of the principles of quantum mechanics 
used the language and techniques of statistical transformation theory which he developed 
from his pioneering geometric-algebraic approach discussed in Chap. 14 and developed in 
Sect. 17.3. The various manipulations of operators and wavefunctions developed over the 
last eight chapters were the tools used in Dirac’s exposition of 1930. 

In 1939, Dirac invented the bra and ket notation which resulted in a significantly more 
elegant notation for the underlying algebraic structure of the theory (Dirac, 1939). This new 
formalism was incorporated into the third edition (Dirac, 1947) and remained essentially 
unchanged in the subsequent fourth and definitive edition (Dirac, 1958). Let us summarise 
the translation from the notation of 1930 to the post-1939 version. This also completed 
the transformation of the theory from a more-or-less self-consistent set of assumptions and 
rules into the modern axiomatic exposition of quantum mechanics. Dirac’s notation works 
as follows: 


• The wavefunction Ψ(x, t) incorporates all knowledge about the evolution of the system. 
In general, it is defined in an infinite dimensional function space, what is referred to as 
Hilbert space. Ψ(x, t) can be thought of as a vector in this space. Ψ(x, t) may consist of 
functions of the spatial coordinates and time as well as spin coordinates. It is referred to 
as the state of the system, or the state vector and is written |Ψ⟩, meaning ‘the state with 
state function Ψ(x, t)’. In the time-independent case, lower case Greek symbols are used 
so that |ψ⟩ means ‘the time-independent state with state function ψ(x)’. 

• In carrying out the evaluation of the integrals over complete sets of eigenfunctions, we 
need the complex conjugates of the wavefunctions in order to form integrals such as 

$$\int_{-\infty}^{\infty} \phi^*(x)\, \psi(x)\, \mathrm{d}x . \tag{17.68}$$

Dirac introduced the notation ⟨φ| to rewrite the integral (17.68) as 

$$\int_{-\infty}^{\infty} \phi^*(x)\, \psi(x)\, \mathrm{d}x = \langle \phi | \psi \rangle , \tag{17.69}$$

so that |ψ⟩ and ⟨φ| are shorthand notation for the wavefunction ψ and the complex 
conjugate of the wavefunction φ respectively and the combination of the two, written as 
⟨φ|ψ⟩, is an abbreviation for the integral (17.69). If the wavefunction ψ is normalised, 
⟨ψ|ψ⟩ = 1. In Dirac’s language of 1939, 
◦ |ψ⟩ is called a ket vector, 
◦ ⟨ψ| is called a bra vector, the complex conjugate of |ψ⟩, 
◦ ⟨φ|ψ⟩ is called a Dirac bra(c)ket, or inner product of the state vectors. 

• It is conventional to write operators with a ‘hat’ so that the operator A is written Â. Then, 
the state found by operating upon ψ with Â is denoted |Âψ⟩. We can also define the bra 
vector corresponding to this state by analogy with the relation φ*(x) → ⟨φ| so that 

$$(\hat{A}\psi)^* = \langle \hat{A}\psi | . \tag{17.70}$$

The integral corresponding to 

$$\int_{-\infty}^{\infty} \phi^* \hat{A}\, \psi \,\mathrm{d}x \tag{17.71}$$

can therefore be written in Dirac notation as 

$$\int_{-\infty}^{\infty} \phi^* \hat{A}\, \psi \,\mathrm{d}x = \langle \phi | \hat{A} | \psi \rangle . \tag{17.72}$$

The quantity ⟨φ|Â|ψ⟩ is called the matrix element associated with the state vectors |ψ⟩ 
and |φ⟩. Because of the rules for forming the complex conjugates of the states and the 
adjoint properties of Â, 

$$\langle \phi | \hat{A} | \psi \rangle^* = \langle \psi | \hat{A}^\dagger | \phi \rangle , \tag{17.73}$$

where Â† is the adjoint operator associated with Â. For the special case of a unitary 
operator, Â†Â = Î, where Î is the identity operator. 

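The expectation value for an observable quantity represented by an operator Â with the 
system in the state |ψ⟩ can be written 

$$\langle A \rangle = \int_{-\infty}^{\infty} \psi^*(x)\, \hat{A}\, \psi(x)\, \mathrm{d}x = \langle \psi | \hat{A} | \psi \rangle . \tag{17.74}$$

An integral part of the formalism is that the operator Â should be Hermitian, or 
self-adjoint, so that the eigenvalues associated with the eigenfunctions are real. For Hermitian 
operators, Â = Â† and so 

$$\langle A \rangle^* = \langle \psi | \hat{A} | \psi \rangle^* = \langle \psi | \hat{A}^\dagger | \psi \rangle = \langle \psi | \hat{A} | \psi \rangle = \langle A \rangle . \tag{17.75}$$
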
In the Dirac notation, the eigenvalue equation is written 

$$\hat{A} |\psi\rangle = a |\psi\rangle . \tag{17.76}$$

Thus, |ψ⟩ is the eigenstate corresponding to the operator Â and a is the associated 
eigenvalue; ψ(x) is the corresponding eigenfunction of Â. As examples of operators in 
the Dirac notation, the momentum operator is p̂ = (h/2πi) ∂/∂x in one dimension and 
the Hamiltonian operator is 


$$\hat{H} = \hat{T} + \hat{V} = -\frac{h^2}{8\pi^2 m_e}\nabla^2 + V(x) , \tag{17.77}$$

where the two terms correspond to the kinetic energy operator T̂ and the scalar potential 
energy operator V̂. Thus, the eigenvalue equation for the stationary states of a system is 
given by the solution of the eigenfunction equation 

$$\hat{H}\psi = \hat{T}\psi + \hat{V}\psi = -\frac{h^2}{8\pi^2 m_e}\nabla^2 \psi + V(x)\,\psi = E\psi , \tag{17.78}$$

which is the time-independent Schrödinger equation with energy eigenvalues E. 
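
The rules summarised above translate directly into linear algebra once the wavefunctions are sampled on a grid. The following sketch is an illustration only: the grid, the two sample states, the random non-Hermitian test operator and the helper names `braket`, `matrix_element` and `normalise` are assumptions introduced here, and the Hamiltonian is a finite-difference approximation to (17.77) for a harmonic potential in units with h/2π = mₑ = 1. It checks the adjoint rule (17.73), the reality of the expectation value (17.75), and the energy eigenvalues of (17.78).

```python
import numpy as np

x = np.linspace(-8.0, 8.0, 801)
dx = x[1] - x[0]


def braket(phi, psi):
    """<phi|psi> as a discretised version of the integral (17.69)."""
    return np.sum(np.conj(phi) * psi) * dx


def matrix_element(phi, operator, psi):
    """<phi|A|psi> as in (17.72), with the operator given as a matrix."""
    return braket(phi, operator @ psi)


def normalise(state):
    return state / np.sqrt(braket(state, state).real)


# Two sample states (hypothetical choices).
psi = normalise(np.exp(-x ** 2 / 2) * np.exp(0.5j * x))
phi = normalise(x * np.exp(-x ** 2 / 2))

# Finite-difference Hamiltonian H = -(1/2) d^2/dx^2 + x^2 / 2.
n = x.size
off_diagonal = np.ones(n - 1)
laplacian = (np.diag(off_diagonal, -1) - 2.0 * np.eye(n)
             + np.diag(off_diagonal, 1)) / dx ** 2
hamiltonian = -0.5 * laplacian + np.diag(0.5 * x ** 2)

# Adjoint rule (17.73), tested with a generic non-Hermitian operator.
rng = np.random.default_rng(0)
generic = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
lhs = np.conj(matrix_element(phi, generic, psi))
rhs = matrix_element(psi, generic.conj().T, phi)

# Expectation value (17.74) of a Hermitian operator, and its eigenvalues.
expectation = matrix_element(psi, hamiltonian, psi)
energies = np.linalg.eigh(hamiltonian)[0][:3]

print(lhs, rhs)
print(expectation)
print(energies)
```

In these units the three lowest eigenvalues should lie close to 1/2, 3/2 and 5/2, and the expectation value should be real to within rounding error, as (17.75) requires.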


Dirac’s great book repays close reading. The various complex pathways by which the 
new understandings were obtained are bypassed and the structure no longer depended upon 
concepts such as the correspondence principle. Just as Newtonian mechanics needed first the 
development of differential and integral calculus and then the Lagrangian and Hamiltonian 
approaches to perfect the mathematical structures, so Dirac’s book is an account of the 
non-commutative algebra of operators. On first encounter, it appears to be a somewhat 
formal mathematical approach to quantum mechanics, but close reading shows how Dirac’s 
thought was strongly governed at each stage by the need to account for a relatively small 
number of phenomena which contain the essence of what the algebra had to encompass. It 
is one of the great books of physics. 


17.7.3 The Copenhagen interpretation 


In due course, the rules of quantum mechanics were hammered out into a self-consistent 
set of postulates which form the basis of what may be called the axiomatic approach to 
quantum mechanics. It is often referred to as the Copenhagen interpretation and is based 
upon the results of the many different analyses described in the last seven chapters. It has 
to be said that exactly what constitutes the Copenhagen interpretation is a matter of debate, 
but the following set of rules is fully consistent with what Bohr and his collaborators 
eventually agreed upon.* 


• The most complete knowledge we can have of a system is represented by the state vector 
|Ψ⟩. 
• To every observable A there corresponds a Hermitian operator Â. The result of a 
measurement of A must be one of the eigenvalues of Â. 
• If the eigenvalue a corresponds to the eigenstate |φ⟩, then the probability of obtaining 
the result a when the system is in the state Ψ is |⟨φ|Ψ⟩|². 
• As a result of the measurement of A in which the result a is obtained, the state of the 
system is changed to the corresponding eigenstate |φ⟩. 
• Between measurements, the state vector |Ψ⟩ evolves with time according to the 
time-dependent Schrödinger equation 

$$\frac{ih}{2\pi}\frac{\partial}{\partial t}|\Psi\rangle = \hat{H}|\Psi\rangle . \tag{17.79}$$


The consequences of these rules are profound. For example, we described in Sect. 17.2.2 
how the expectation value of any measurement is found for the observable A when the system 
has state vector |ψ⟩ = Σᵢ cᵢ|φᵢ⟩ in the time-independent case. The result of any particular 
measurement is that the system is found in the eigenstate |φⱼ⟩ with eigenvalue aⱼ and that 
the probability of that value being obtained is |cⱼ|² = |⟨φⱼ|ψ⟩|². Hence, the expectation 
value, in the sense of the mean value of a large number of experiments, is ⟨A⟩ = Σᵢ aᵢ|cᵢ|². 
But, once the measurement is made, the system is in one particular state |φⱼ⟩. The effect 
of the measurement of the observable A is to force the system into one of the eigenstates 
of Â, with the result that cⱼ is now 1 and all the other cᵢ for which i ≠ j are zero. This is 
a key feature of the way in which quantum mechanics works. This process of forcing the 
system into one of the possible state functions is referred to as the collapse or reduction of 
the wavefunction. 

This is one of the most remarkable results of the formalism of quantum mechanics. In 
terms of amplitudes, the theory is linear — it is only when an experiment is made that the 
system is forced into one of the eigenstates and the a priori probabilities of this happening 
are given by the square of the modulus of the probability amplitudes. 
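
The measurement rules can be made concrete with a small finite-dimensional example. The sketch below is purely illustrative: the three-state Hermitian matrix standing for Â, the initial state and the function name `measure` are hypothetical, and a pseudo-random number generator stands in for the intrinsic randomness of the outcome. It computes the coefficients cᵢ = ⟨φᵢ|ψ⟩, the probabilities |cᵢ|² and the expectation value, and then shows that a repeated measurement on the collapsed state returns the same eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(1)

# A hypothetical three-state observable, represented by a Hermitian matrix.
a_matrix = np.array([[2.0, 1.0j, 0.0],
                     [-1.0j, 3.0, 1.0],
                     [0.0, 1.0, 1.0]])
eigenvalues, eigenvectors = np.linalg.eigh(a_matrix)

# An arbitrary normalised initial state |psi>.
psi = np.array([1.0, 1.0j, -0.5])
psi = psi / np.linalg.norm(psi)

# Expansion coefficients c_i = <phi_i|psi> and probabilities |c_i|^2.
coefficients = eigenvectors.conj().T @ psi
probabilities = np.abs(coefficients) ** 2

# Expectation value from the probabilities and directly as <psi|A|psi>.
expectation_from_probs = np.sum(eigenvalues * probabilities)
expectation_direct = np.vdot(psi, a_matrix @ psi).real


def measure(state):
    """Pick an eigenvalue with probability |c_j|^2 and collapse the state."""
    c = eigenvectors.conj().T @ state
    probs = np.abs(c) ** 2
    j = rng.choice(len(probs), p=probs / probs.sum())
    return eigenvalues[j], eigenvectors[:, j]


outcome, collapsed = measure(psi)
repeat_outcome, _ = measure(collapsed)

print(probabilities, expectation_from_probs, expectation_direct)
print(outcome, repeat_outcome)
```

The first measurement can return any of the three eigenvalues, but once the state has been reduced to an eigenstate the second measurement reproduces the first result.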







The aftermath 





We have reached the goal I set myself when this journey began. From about 1927 onwards, 
the quantum theory in its modern guise for non-relativistic quantum mechanics was essen- 
tially complete, although there remained problems of interpretation which took a number 
of years to unravel — some of them are still hotly debated. But much of the apparatus was 
already in place and the subsequent developments changed completely the face of physics. 
Jammer (1989) summarises the achievement as follows: 


‘Since 1927, the development of quantum mechanics and its applications to molecular 
physics, to the solid state of matter, to liquids and gases, to statistical mechanics, as well as 
to nuclear physics, demonstrated the overwhelming generality of its methods and results. 
In fact, never has a physical theory given a key to the explanation and calculation of 
such a heterogeneous group of phenomena and reached such a perfect agreement with 
experience as has quantum mechanics.’ 


As noted by Mehra and Rechenberg (2001), the 1930s also saw the beginning of the com- 
partmentalisation of physics into separate quantum disciplines. Thus, during the 1930s, with 
the general acceptance and success of quantum mechanics, the quantum physicists began 
to specialise in disciplines such as atomic physics, molecular physics, solid state physics, 
including metal and semiconductor physics, condensed matter physics and low tempera- 
ture physics, while at high energies, nuclear, particle and cosmic ray physics developed as 
disciplines in their own right. Whereas the pioneers of quantum mechanics regarded the 
whole province of quantum physics as their domain, the various branches of quantum 
physics became fragmented into these specialisms, not so different from those encountered 
in any physics department today. 

The years up to the outbreak of the Second World War were a golden age of theoretical 
and experimental physics. Let us outline briefly some of the major achievements of theory, 
experiment, observation and the interpretation of these data. These topics are now part of the 
basic infrastructure of modern physics and the details of the experiments and calculations 
can be found in the standard textbooks. 


18.1 The development of theory 




Theoretical quantum mechanics developed into the various subdisciplines mentioned above, 
but a key priority for theory and experiment was the development of a formalism which 
could encompass the quantisation of the electromagnetic field as well as the mechanics 
and dynamics of particles. In fact, the first attempt to include the quantisation of the 
electromagnetic field into the scheme of matrix mechanics appeared in the last sections of 
the pioneering papers by Born and Jordan (1925b) and in the Three-Man Paper (Born et al., 
1926). In this approach, the electromagnetic waves were treated as a system of harmonic 
oscillators. The interaction of radiation and matter next appeared in Dirac’s treatment 
of Compton scattering using a relativistic version of his q-number approach to quantum 
mechanics (Dirac, 1926c). In his next assault on the quantum theory of radiation, Dirac 
extended the theory of the emission and absorption of radiation by requiring the q-numbers 
which characterise the radiation to be the Fourier components of the energy Eᵣ and its 
conjugate ‘phase’ θᵣ, and that they should satisfy the commutation relations 

$$\theta_r E_r - E_r \theta_r = \frac{ih}{2\pi} . \tag{18.1}$$


His achievement in this paper was his demonstration that this procedure led to the determi- 
nation of the values of Einstein’s A and B coefficients (Dirac, 1927). According to Jammer 
(1989), Dirac’s procedure for quantising the electromagnetic field marked the beginning 
of quantum electrodynamics. The derivation of the Dirac equation and the predictions of 
the magnetic moment of the electron and of positrons were spectacular achievements (Dirac, 
1928a,b; see Sect. 16.6) but the electromagnetic field itself still had to be built 
into a consistent theory of relativistic quantum mechanics. In the succeeding years, Dirac, 
Heisenberg, Jordan, Klein and Pauli made major inroads into the proper formulation of 
quantum electrodynamics, step-by-step incorporating the features now familiar in their 
modern contexts. 

To mention only a few of the steps along the way, in 1928, Jordan and Pauli published 
their paper on the quantum electrodynamics of charge-free fields in which they extended 
Dirac’s formalism by introducing non-commuting q-numbers for the electromagnetic field 
and casting them in relativistically invariant form (Jordan and Pauli, 1928). Heisenberg 
and Pauli returned to the attack in early 1929. They had already introduced a relativistic 
Lagrangian in which the quantum field variables are given in a form similar to that used in 
classical mechanics. The problem they faced was that the formalism led to the conjugate 
momentum of the fields being identically zero. This was remedied by a ‘trick’ discovered 
by Heisenberg of adding an extra term to the Lagrangian which was multiplied by the small 
quantity ε. This resolved the problem of the vanishing conjugate momentum and led to 
sensible results in the limit ε → 0. The result of these endeavours was a long and important 
paper by Heisenberg and Pauli (1929), which begins with the words, 


‘So far it has not been possible to connect in quantum theory the mechanical and electrody- 
namic laws, electrostatic and magnetostatic interactions on the one hand, and interactions 
mediated by radiation on the other hand, in a unified point of view. In particular, one 
has not proceeded to take into account correctly the finite velocity of propagation of the 
actions due to electromagnetic forces.’ 


Their paper set out to provide this unified view and take account of retardation effects 
in radiation problems. The three chapters describe the ‘general methods’, the derivation 
of the ‘fundamental equations of the theory for electromagnetic and matter fields’ and 
‘approximation methods for the integration of the equations and physical applications’. 
The new relativistic quantum field theory contained both Fermi–Dirac and Bose–Einstein 
statistics. 

Pauli and Heisenberg sought to continue along the lines of their paper of 1929 and were 
stimulated by the publication of Weyl’s Gruppentheorie und Quantenmechanik (1928) to 
generalise the group-theoretical methods of Weyl to the case of wave fields. This had the 
effect of simplifying the presentation of relativistic quantum field theory and of eliminating 
the ‘trick’ of the vanishing ε which Heisenberg had introduced in their first paper 
(Heisenberg and Pauli, 1930). 

In classical mechanics, there is a close relation between the symmetry properties of a 
mechanical system and conservation laws.¹ A much more general approach to the problems 
of symmetry under different types of mathematical operation is provided by group theory. 
Already in his paper of June 1926, Heisenberg had made use of group theoretical results in 
his study of the many-body problem and resonances in quantum mechanics (Heisenberg, 
1926a). These ideas were to be taken very much further by the young Eugene Wigner. 
In 1927, he published an ambitious paper entitled Some consequences from Schrödinger’s 
theory for the term structures (Wigner, 1927) in which group-theoretical concepts were 
applied to Schrödinger’s wave equation. From these considerations, Wigner was able to 
derive the selection rules for azimuthal quantum numbers as well as the term structure 
and selection rules when atoms are subject to electric and magnetic fields. He concluded 
triumphantly that he had been able to 


‘demonstrate that one can, by rather simple symmetry considerations with the Schrödinger 
equation, already explain an essential part of the purely qualitative spectroscopic experi- 
ence.’ 


In his subsequent development of the full power of group theory as applied to quantum 
mechanical problems, he was joined by his colleague John von Neumann. The full significance 
of Wigner’s innovations can be appreciated from the remark of van der Waerden, 
himself a specialist in group theory, that 


‘Wigner’s paper seems to be the first in which group theory was applied to [quantum] 
physics. In this paper, some rules of term zoology were deduced, but the spin was left out of 
account. In the papers of von Neumann and Wigner [Neumann and Wigner (1928a,b,c)], 
the whole apparatus of group characters and representations was put into action and the 
complete system of term zoology, including selection rules, intensity formulae [and] Stark 
effect was developed.’ (van der Waerden, 1960) 


For example, in the third paper by von Neumann and Wigner, the expression for the Landé 
g-factor was derived purely from group theory, specifically from the group-theoretical 
multiplication formula, which resulted in 


$$g = 1 + \frac{j(j+1) - l(l+1) + s(s+1)}{2j(j+1)}. \qquad (18.2)$$

The works of Wigner and Neumann were incorporated into Weyl’s Gruppentheorie und 
Quantenmechanik (1928), Weyl stating that he had independently derived many of the key 
results. 
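
As a numerical illustration of the Landé formula (18.2), the following minimal sketch evaluates g for a few familiar terms. The function name and the choice of exact rational arithmetic are assumptions made for the illustration; nothing here reproduces the group-theoretical derivation of von Neumann and Wigner, only the final expression.

```python
from fractions import Fraction


def lande_g(j, l, s):
    """Lande g-factor g = 1 + [j(j+1) - l(l+1) + s(s+1)] / [2 j(j+1)].

    Arguments may be ints or Fractions (e.g. Fraction(1, 2) for spin 1/2).
    """
    j, l, s = Fraction(j), Fraction(l), Fraction(s)
    if j == 0:
        raise ValueError("g is undefined for j = 0")
    numerator = j * (j + 1) - l * (l + 1) + s * (s + 1)
    return 1 + numerator / (2 * j * (j + 1))


half = Fraction(1, 2)
# A pure-spin state and the two fine-structure levels of a single p electron.
print("2S1/2:", lande_g(half, 0, half))             # 2
print("2P1/2:", lande_g(half, 1, half))             # 2/3
print("2P3/2:", lande_g(Fraction(3, 2), 1, half))   # 4/3
```

The limiting cases behave as expected: for s = 0 the formula gives g = 1 (pure orbital motion) and for l = 0 it gives g = 2 (pure spin).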

The group-theoretical approach opened up new ways of tackling the problem of the 
origin of the chemical bond. Hund had made use of symmetry arguments derived from 
the group-theoretical aspects of Heisenberg’s and Wigner’s papers of 1927 to interpret the 
binding of atoms in molecules (Hund, 1927c). These arguments provided a more visualisable 
approach to the combination of wavefunctions for the atoms of a molecule. A pure group- 
theoretical approach was adopted by Walter Heitler and Fritz London who investigated the 
forces between two neutral atoms using Heisenberg’s exchange integral. The result was an 
explanation for the homopolar covalent bonds found in molecular chemistry (Heitler and 
London, 1927). 

Following this advance, Heitler went on in his next paper to explore the full capabilities 
of group theory for quantum chemistry (Heitler, 1927). What he did is best explained in 
his own words. 


‘London did not join me in that: he thought it was too complicated. Wigner’s paper 
had appeared by that time, and I saw immediately that it could be useful for further 
development of the theory of chemical bonds, but London wanted to go on in his own 
more intuitive way... 

So I started to study group theory. I first read the book by Speiser ... Later on I read 
the papers by Schur and other people. The very nice thing was that the mathematicians 
had prepared group theory so well for the use of the physicists without knowing it that 
sometimes I could just copy, word for word, pages from a group-theory paper and use them 
for my purposes.’ (Heitler interview: the Archives for the History of Quantum Physics, 
1963) 


Among the achievements of this paper was his demonstration that the closed shells of an 
atom can result in no new splittings of the energy levels, but only a shift in the levels. As 
he remarked, 


‘[This conclusion is] solely responsible for the existence of the periodic system with 
homologous sequences, in which the elements exhibit homologous spectroscopic and 
chemical properties.’ 


These events reflect the fact that the theory of quantum mechanics was becoming increas- 
ingly mathematically sophisticated and involved mathematics which was often beyond the 
reach of the average physicist. Slater was among those disaffected by these developments, 
referring to the group theory approach as the ‘Gruppenpest’, the pest of group theory 
(Slater, 1975). Slater did not simply complain, however, but converted the formalism of many-
electron atoms into a viable scheme for the determination of the energy levels and structure 
of atoms. He realised that the properties of atoms, including their spin properties, 
could be written entirely in terms of antisymmetric wavefunctions, thus explicitly includ- 
ing the Pauli exclusion principle (Slater, 1929). Combining these concepts with Douglas 
Hartree’s ‘self-consistent field’ approach to the determination of atomic structure (Hartree, 
1927a,b) led to many important results in the understanding of atoms and their spectra. For 
example, he was able to recover Heitler’s result that filled shells make no contribution to 
the splitting of the energy levels of atoms. With some pride Slater recalled, 


‘As soon as this paper became known, it was obvious that a great many other physicists 
were as disgusted as I had been with the group-theoretical approach to the problem. As 
I heard later, there were remarks made such as “Slater has slain the ‘Gruppenpest’ ”. I 
believe that no other piece of work I had done was so universally popular.’ (Slater, 1975) 


Despite these somewhat reactionary remarks, group theory was there to stay in the armoury 
of weapons to be used in tackling problems in quantum mechanics. 

To complete the story of the understanding of spin up to 1940, the final piece of the jigsaw 
was the relation of the spin of the particles to their statistical properties. This connection 
was discovered in the studies of Markus Fierz who worked as an assistant to Pauli from 
1936 to 1940. Fierz analysed the properties of quantum field theories and, using the spinor 
calculus of van der Waerden (1932), found that 


‘Particles with integral spin must always satisfy Bose statistics and particles with half- 
integral spin Fermi statistics.’ (Fierz, 1939) 


These results were consolidated in joint papers with Pauli in 1939 and 1940 (Fierz and 
Pauli, 1939; Pauli and Fierz, 1940). An even more general approach which gave the same 
result was developed by Frederick Joseph Belinfante who introduced undors. These objects 
could describe both integral and half-integral spin particles (Pauli and Belinfante, 1940). 


18.2 The theory of quantum tunnelling 


One of the earliest triumphs of quantum mechanics was the theory of quantum mechanical 
tunnelling as applied to molecules and to the emission of α-particles in radioactive decays.² 
Friedrich Hund’s objective was to understand the nature of the stationary states of diatomic 
molecules. He modelled the potential well experienced by the electrons in molecules by a 
double-well potential (Fig. 18.1) in order to work out the eigenstates of molecules (Hund, 
1927a,b,d). He found that the superposition of the even ground state and the odd first excited 
state shown in Fig. 18.1b resulted in a non-stationary state in which the electron oscillated 
back and forth between the two potential wells, the period T of the oscillation being 

$$\frac{T}{\tau} \approx \left(\frac{h\nu}{V}\right)^{1/2} \exp\left(\frac{V}{h\nu}\right), \qquad (18.3)$$


where τ = 1/ν is the period of oscillation of the electron in one of the potential wells when 
they are widely separated and V is the height of the potential barrier. Notice the exponential 
dependence of the oscillation frequency upon the height of the potential barrier. 

In the same year, Lothar Nordheim published a paper on barrier penetration associated 
with the thermionic emission of electrons from a heated metal surface (Nordheim, 1927). 
In his calculations, he introduced the concept of a rectangular potential barrier, a picture 
now familiar in every quantum mechanics textbook (Fig. 18.2). 
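
The rectangular barrier lends itself to a short numerical sketch. The code below evaluates the standard textbook transmission coefficient for a particle of energy E less than the barrier height V0, T = [1 + V0² sinh²(κL) / (4E(V0 − E))]⁻¹ with κ = [2m(V0 − E)]^{1/2}/ħ. This is the modern textbook result rather than Nordheim’s own calculation, and the electron energies and barrier widths used are illustrative assumptions, not values from his paper.

```python
import math

# hbar^2 / (2 m_e) in eV nm^2 (standard value for the electron)
HBAR2_OVER_2ME = 0.0380998


def transmission(energy_ev, barrier_ev, width_nm):
    """Transmission coefficient of a rectangular barrier for 0 < E < V0."""
    if not 0.0 < energy_ev < barrier_ev:
        raise ValueError("this expression holds only for 0 < E < V0")
    kappa = math.sqrt((barrier_ev - energy_ev) / HBAR2_OVER_2ME)  # nm^-1
    sinh2 = math.sinh(kappa * width_nm) ** 2
    return 1.0 / (1.0 + barrier_ev**2 * sinh2
                  / (4.0 * energy_ev * (barrier_ev - energy_ev)))


# Illustrative (assumed) numbers: a 1 eV electron meeting a 5 eV barrier.
for width in (0.1, 0.2, 0.5, 1.0):
    print(f"L = {width:.1f} nm  ->  T = {transmission(1.0, 5.0, width):.3e}")
```

For a thick barrier the result tends to 16E(V0 − E)/V0² × exp(−2κL), which makes explicit the exponential sensitivity to the width and height of the barrier that underlies all the applications discussed in this section.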
Fig. 18.1 Examples of the double potential wells considered by Hund in his determination of the eigenstates of diatomic 
molecules. (a) An asymmetric double potential showing the first five eigenfunctions. For states W and W3, transitions 
through the potential barrier are not allowed according to classical physics. (b) A symmetric double potential well with 
a finite potential barrier, also showing the first six eigenfunctions. For all the states shown, transitions through the 
potential barrier are not allowed classically (Hund, 1927a). 




Fig. 18.2 The rectangular potential barrier used by Nordheim to estimate the barrier penetration rate of hot electrons within a 
metal surface (Nordheim, 1927). 


Perhaps the most famous example of the application of barrier penetration was to the 
process of α decay, the first application of quantum mechanics to the atomic nucleus. When 
George Gamow arrived in Göttingen from the Soviet Union in 1927, he read Rutherford’s 
paper of 1927 on the problem of understanding the α decay in thorium C’, or polonium-212 
(²¹²Po). Geiger’s α-scattering experiments had shown that the height of the electrostatic 
Fig. 18.3 (a) The one-dimensional model used by Gamow in his paper on barrier penetration by α-particles, showing the 
amplitude of the oscillating wavefunction within the nucleus, decaying through the potential barrier and then 
propagating as a wave outside the potential barrier (Gamow, 1928). (b) The Geiger–Nuttall law showing the relation 
between the half-life of α decay nuclei and the energies of the emitted α-particles. Note the enormous range of 
half-lives shown on a logarithmic scale on the ordinate and the small range of energies shown on a linear scale on the 
abscissa. The measurements follow closely the expectation of barrier penetration theory, specifically the expression 
(18.4) (Courtesy of Creative Commons). 


potential barrier within which the nucleons were confined was at least 8.57 MeV, and yet 
the energies of the α-particles observed in the α decay of thorium C’ were less than half 
this value, 4.2 MeV. Gamow realised that this was an example of barrier penetration in 
quantum mechanics. The nuclear potential could be modelled as a deep rectangular poten- 
tial well, as illustrated in Fig. 18.3a. Then, according to the quantum calculation for barrier 
penetration, although the barrier is impenetrable according to classical physics, there is a 
Fig. 18.4 The apparatus with which Rutherford demonstrated the nuclear disintegration of nitrogen atoms (Rutherford, 1919, 
1920). 


finite probability that the α-particles can reach the other side because of the wave properties 
of the particle. This is illustrated by the diagram from Gamow’s paper (Fig. 18.3a) which 
shows the amplitude of the wavefunction on both sides of the barrier and within it, another 
diagram which appears in all the standard textbooks. In fact, Gamow and independently 
Ronald Gurney and Edward Condon almost simultaneously solved Schrödinger’s equation 
for the nuclear potential shown in Fig. 18.3a and derived a relationship between the decay 
constant λ of the nucleus against α-particle decay and the energy of the α-particle 
(Gamow, 1928; Gurney and Condon, 1928, 1929). This theory of α decay could account 
naturally for the very narrow range of energies of the α-particle and the enormous range 
of decay constants, as found by Hans Geiger and John Nuttall in 1911. The law can be 
written 


$$\ln \lambda = -A \frac{Z}{\sqrt{E}} + B, \qquad (18.4)$$

where λ = ln 2/(half-life), Z is the atomic number, E the total kinetic energy of the 
α-particle and the residual nucleus, and A and B are constants (Geiger and Nuttall, 1911). 
A modern version of the Geiger–Nuttall law is shown in Fig. 18.3b. 
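
The Z/√E dependence in (18.4) can be made plausible with a rough sketch. In the limit in which the α-particle energy is much less than the barrier height, the leading term of the barrier penetration exponent is 2πη = 2π Z₁Z₂ α_fs / β, where α_fs is the fine-structure constant and β = v/c. The code below evaluates this term non-relativistically for an α-particle (Z₁ = 2). It is only the leading term: the finite nuclear radius, which contributes to the constant B, and the recoil of the residual nucleus are neglected, and the function name and chosen energies are assumptions for the illustration.

```python
import math

ALPHA_FS = 1.0 / 137.035999   # fine-structure constant
M_ALPHA_C2 = 3727.379         # rest energy of the alpha particle in MeV


def penetration_exponent(z_daughter, energy_mev):
    """Leading-order tunnelling exponent 2*pi*eta for an alpha particle.

    z_daughter : charge number of the residual nucleus
    energy_mev : kinetic energy of the alpha particle in MeV (non-relativistic)
    """
    beta = math.sqrt(2.0 * energy_mev / M_ALPHA_C2)   # v/c
    return 2.0 * math.pi * 2 * z_daughter * ALPHA_FS / beta


# Illustrative energies spanning the observed range, for a lead-like
# residual nucleus (Z = 82); the tunnelling probability scales as exp(-exponent).
for energy in (4.0, 6.0, 9.0):
    print(f"E = {energy:.1f} MeV  ->  exponent = {penetration_exponent(82, energy):.1f}")
```

Because the exponent is proportional to Z/√E, a change of little more than a factor of two in the energy alters the exponent by several tens, and hence the decay constant by many orders of magnitude, which is the qualitative content of the Geiger–Nuttall law.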


18.3 The splitting of the atom and the Cockcroft and Walton experiment


During the last years of his tenure of the Chair of Physics at Manchester University, 
Rutherford carried out a key experiment in which he bombarded nitrogen atoms with the 
α-particles produced in the radioactive decays of radium-C (Fig. 18.4). The surprising 
result was the detection of the emission of energetic particles, which, after a careful set 
of further experiments, Rutherford concluded had to be protons liberated in collisions 
between the α-particle and the nuclei of nitrogen atoms. As he wrote in the conclusion of his 
paper, 


‘From the results so far obtained it is difficult to avoid the conclusion that the long-range 
atoms arising from the collision of alpha particles with nitrogen are not nitrogen atoms 
but probably atoms of hydrogen, or atoms of mass 2. If this be the case, we must conclude 
that the nitrogen atom is disintegrated under the intense forces developed in a close 
collision with a swift alpha particle, and that the hydrogen atom which is liberated formed 
a constituent part of the nitrogen nucleus.’ (Rutherford, 1919) 


We now understand that the origin of the fast protons was the nuclear interaction 
$${}^{14}\mathrm{N} + \alpha \rightarrow {}^{17}\mathrm{O} + \mathrm{p}. \qquad (18.5)$$


Further experiments were undertaken by Rutherford and James Chadwick who found 
similar fast protons when other atoms were bombarded by α-particles. A problem was that 
the energies of the α-particles were limited to those available from naturally occurring 
radioactive nuclides. Ideally, a more controlled beam of incident particles was required. 
The difficulty was that, as can be seen from Fig. 18.3b, the energies of the particles would 
have to be in the MeV energy range and it was a serious technological challenge to create 
such electrostatic potentials in the laboratory. 

Gamow realised that his theory of barrier penetration could account for the penetration 
of α-particles into nitrogen nuclei in Rutherford’s experiments. He explained the theory 
in a manuscript sent to Rutherford and his colleague John Cockcroft in December 1929. 
Cockcroft repeated Gamow’s calculations and showed that, because of the process of barrier 
penetration, protons accelerated to only 300 keV could penetrate a boron nucleus with about 
0.6% probability. Cockcroft inferred that an accelerating electric potential of only 300 kV 
would be sufficient to penetrate the boron nucleus and induce nuclear transmutations. 
By 1932, after a great deal of effort, Cockcroft and Ernest Walton had developed the 
technology of electrostatic particle acceleration to the extent that potentials of 700 kV 
could be sustained, accelerating protons to energies of 700 keV (Fig. 18.5). They succeeded in 
inducing the first artificial nuclear disintegrations by bombarding lithium nuclei with high 
energy protons (Cockcroft and Walton, 1932). The process involved was 


$${}^{7}\mathrm{Li} + \mathrm{p} \rightarrow {}^{4}\mathrm{He} + {}^{4}\mathrm{He}. \qquad (18.6)$$


The energies of the accelerated protons were precisely known, as were the rest masses 
of the lithium and helium atoms. The kinetic energies of the helium nuclei ejected in 
the interaction (18.6) could be measured. As a result, this nuclear interaction provided 
the first direct experimental test of Einstein’s mass-energy relation E = mc²; precise 
agreement was found with Einstein’s prediction. This experiment marked the beginning 
of experimental high energy physics in which particles are accelerated to high energies 
and used as probes of the structure of the nucleus and as tools for the discovery of new 
particles. 
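
The energy bookkeeping behind this test can be sketched in a few lines. The code below computes the energy released in the interaction (18.6) from the mass defect using E = mc². The atomic masses are modern tabulated values rounded to six decimal places, an assumption of this illustration rather than the figures available to Cockcroft and Walton in 1932.

```python
U_TO_MEV = 931.494   # energy equivalent of one atomic mass unit in MeV

# Modern atomic masses in atomic mass units (assumed here, not from the text).
M_LI7 = 7.016003
M_H1 = 1.007825
M_HE4 = 4.002603


def q_value_mev(reactant_masses, product_masses):
    """Energy released (MeV) = (mass of reactants - mass of products) * c^2."""
    return (sum(reactant_masses) - sum(product_masses)) * U_TO_MEV


q = q_value_mev([M_LI7, M_H1], [M_HE4, M_HE4])
print(f"Q = {q:.2f} MeV")   # Q = 17.35 MeV
```

Atomic rather than nuclear masses can be used because the four electrons on each side of the interaction cancel. The released energy, shared between the two helium nuclei together with the kinetic energy of the incident proton, is the quantity which the measured kinetic energies test.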
Fig. 18.5 The apparatus with which Cockcroft and Walton artificially disintegrated lithium nuclei (Cockcroft and Walton, 1932). 
Walton is sitting inside the little tent, observing the decay products on a luminescent screen. Cockcroft is on the left. 


18.4 Discovery of the neutron 


Rutherford’s Bakerian lecture to the Royal Society of London in 1920 provides a vivid 
picture of the state of knowledge of the properties of the atomic nucleus at that time 
(Rutherford, 1920). Atomic nuclei have masses about two or more times that which can 
be attributed to positively charged protons. The commonly held explanation was that the 
nucleus was composed of electrons and protons, the ‘inner’ electrons neutralising the extra 
protons. The fact that certain nuclei ejected electrons in radioactive β decays supported 
this point of view. Rutherford speculated in his review that the neutral mass in the nucleus 
might be in the form of some new type of particle, similar to the proton but with no electric 
charge. As he wrote, 


‘... it may be possible for an electron to combine much more closely with the H-nucleus 
[than is the case in the ordinary hydrogen atom] ... It is the intention of the writer to test 
[this idea] ... The existence of such atoms seems almost necessary to explain the building 
up of the heavy elements.’ (Rutherford, 1920) 


During the 1920s Rutherford and his colleagues, particularly James Chadwick, made a 
number of unsuccessful attempts to find evidence for these particles, which became known 
as neutrons. Little attention was paid to Rutherford’s proposal. 
Fig. 18.6 The apparatus with which Chadwick discovered the neutron (Chadwick, 1932). 


In 1930, Walther Bothe and Herbert Becker discovered that very penetrating radiation was 
emitted when light elements such as beryllium were bombarded by α-particles (Bothe and 
Becker, 1930). Because the penetrating particles did not cause ionisation, they postulated 
that the neutral particles were high energy γ-rays. In 1932, Irène Joliot-Curie and her 
husband Frédéric Joliot in France carried out a similar series of experiments in which the 
penetrating neutral radiation hit a block of paraffin wax, which was found to emit energetic 
protons. If the energetic protons were produced by Compton scattering, the γ-rays would 
have had to be of very high energy, ~ 50 MeV (Curie and Joliot, 1932). Chadwick guessed 
that the penetrating radiation was rather a flux of the elusive neutrons. Within a matter of 
weeks, he performed the key experiment in which α-particles bombarded a beryllium target, 
releasing neutrons which then collided with a block of paraffin wax (Fig. 18.6). Energetic 
protons were emitted in collisions between the neutrons and the protons in the paraffin wax 
which were detected in an ionisation chamber, enabling the mass of the invisible neutron 
to be estimated. It was inferred that the neutral particles had masses roughly the same as 
that of the proton (Chadwick, 1932). He interpreted the nuclear interaction as the following 
process, 


$$\alpha + {}^{9}\mathrm{Be} \rightarrow {}^{12}\mathrm{C} + \mathrm{n}. \qquad (18.7)$$


This was the discovery of the neutron. 
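
The difficulty with the γ-ray interpretation can be illustrated by a short kinematic sketch. In Compton scattering of a photon of energy E by a proton at rest, the largest recoil energy, attained for backward scattering, is T_max = 2E²/(m_p c² + 2E). Inverting this expression gives the minimum photon energy needed to produce a recoil proton of a given energy. The proton energy of about 5 MeV used below is an illustrative assumption and is not quoted in the text.

```python
import math

M_P_C2 = 938.272   # rest energy of the proton in MeV


def max_recoil_energy(photon_mev):
    """Maximum kinetic energy (MeV) given to a proton at rest by Compton scattering."""
    return 2.0 * photon_mev**2 / (M_P_C2 + 2.0 * photon_mev)


def min_photon_energy(recoil_mev):
    """Minimum photon energy (MeV) able to eject a proton of the given kinetic energy."""
    return 0.5 * (recoil_mev + math.sqrt(recoil_mev**2 + 2.0 * recoil_mev * M_P_C2))


# Assumed recoil-proton energy of about 5 MeV.
print(f"E_gamma >= {min_photon_energy(5.0):.0f} MeV")
```

With this assumed proton energy the required photon energy comes out at roughly 50 MeV, far greater than anything available from the α-particle bombardment of beryllium. A neutral particle of roughly the proton’s mass, by contrast, can hand over essentially all of its kinetic energy to a proton in a head-on elastic collision.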


18.5 Discovery of nuclear fission 


The discovery of the neutron had immediate implications for experimental nuclear physics. 
Unlike the electron or the α-particle, the neutron is electrically neutral and so could penetrate 
the Coulomb barrier of the nucleus. The discipline of nuclear physics was transformed since 
heavy nuclei such as uranium could be bombarded with neutrons, resulting in the formation 
of new isotopes. Such experiments were initiated by Fermi and his colleagues in Rome 
in 1934. They believed that they had demonstrated the formation of a new element with 
atomic number 94 (Fermi et al., 1934), but the result was viewed with some scepticism. 

The experiments were repeated by Otto Hahn, Lise Meitner and Fritz Strassmann. In 
1938, following the Anschluss in which she lost her citizenship, Meitner fled to Sweden 
and continued her collaboration with Hahn by mail. During this correspondence, Hahn 
informed Meitner of his discovery of traces of barium when uranium was bombarded with 
neutrons. This came as a complete surprise since barium has only about 58% of the atomic weight 
of uranium. Meitner soon convinced herself and Hahn that the barium resulted from what 
became known as the nuclear fission of the uranium nuclei. The results were published by 
Hahn and Strassmann (1939). Meitner and her nephew Otto Frisch, who was also working 
in Sweden, published the results of their calculations that a new type of nuclear reaction 
had been observed in Hahn and Strassmann’s experiments (Meitner and Frisch, 1939). 

Leo Szilard was well aware of the significance of these experiments for nuclear energy 
generation. In 1933, following Chadwick’s discovery of the neutron, Szilard had realised 
that a self-sustaining nuclear chain reaction would be possible if the neutrons liberated in 
the types of interaction involved in the Cockcroft and Walton experiment could be used 
to initiate further nuclear interactions. He filed patents for this concept and also carried 
out unsuccessful experiments in which light elements were bombarded with neutrons to 
demonstrate the effect. As soon as the results of Hahn and Strassmann’s experiments were 
published, he immediately realised that this provided a route to a nuclear chain reaction, 
both for the generation of nuclear power and for the creation of nuclear weapons. He urged 
restraint in the publication of these results because of the impending war, but Joliot and his 
colleagues in Paris did not hesitate. The theory of nuclear chain reactions was published 
by both groups in 1939 (von Halban et al., 1939; Szilard and Zinn, 1939). The details of 
this story and the subsequent development of nuclear weapons are vividly told in Rhodes’ 
classic book The Making of the Atomic Bomb (1986). 


18.6 Pauli, the neutrino and Fermi’s theory of weak interactions 


Unlike the process of α decay in which the decay of a radioactive nuclide results in a well-
defined energy for the emitted α-particle, the β decay process results in a broad spectrum 
of electron energies. An example of the continuous energy spectrum of electrons found 
in the decay of radium-E (²¹⁰Bi) is shown in Fig. 18.7 (Neary, 1940). There is an upper 
limit to the energies of the emitted electrons of just over 1 MeV, but the spread of energies 
extends to less than 4% of this value, the maximum occurring at just less than 300 keV and 
the average energy amounting to 390 keV. During the 1920s, there was an ongoing debate 
about whether or not the broad continuum electron energy spectrum could be attributed to 
what were termed ‘ordinary’ processes, meaning that the electrons were created with a single 
energy which was then redistributed by ‘ordinary’ processes such as Compton scattering. 
After two years of challenging experiments, Charles Ellis and William Alfred Wooster 
completed calorimetric experiments in which they showed that the average energy deposited 
in their calorimeter was about 350 keV per disintegration, rather than about 1 MeV as 
Fig. 18.7 An example of the energy spectrum of electrons emitted in the β-decay process. This spectrum shows the continuous 
electron energy spectrum found in the radioactive decay of radium-E (²¹⁰Bi) (Neary, 1940). 


might be expected if all the electrons were injected with the maximum energy of 1 MeV and then 
dissipated by ‘ordinary’ processes (Ellis and Wooster, 1927). The experiment indicated that 
the measured electron energy spectrum was indeed the intrinsic energy spectrum of the 
β-decay process. The problem was that the process seemed to violate conservation of energy. 
Furthermore, with the understanding of the quantum mechanical rules for the addition of 
angular momentum, the hyperfine splitting of atomic lines could be used to determine the 
magnetic moments and hence the spin of nuclei. The process of β decay seemed to involve 
a violation of the law of conservation of angular momentum at the nuclear level. Thus, the 
conservation of both energy and angular momentum were in jeopardy. 

For some time, Bohr returned to his old concern about the validity of the law of con- 
servation of energy at the atomic level. In 1930, in desperation, Pauli suggested that the 
problem might be solved by invoking the existence of a neutral particle which he called a 
‘neutron’. Note that at this date, the only known ‘subatomic particles’ were the proton, the 
electron and the photon. Pauli’s radical proposal was contained in an impassioned letter to 
his expert colleagues working on radioactivity at their meeting in Tübingen.³ 


‘Dear Radioactive Ladies and Gentlemen, 

I have come to a desperate way out regarding the “wrong” statistics of the N and 
⁶Li nuclei, as well as the continuous β-spectrum, in order to save the “alternation law” of 
statistics and the energy law. To wit, the possibility that there could exist in the nucleus 
electrically neutral particles, which I will call neutrons, which have spin 1/2 and satisfy 
the exclusion principle and which are further distinct from light-quanta in that they do 
not move with light velocity. The mass of the neutrons should be of the same order of 
magnitude as the electron mass and in any case not larger than 0.01 times the proton 
mass. ... The continuous β-spectrum would then become understandable from the assumption 
that in β-decay a neutron is emitted along with the electron in such a way that 
the sum of the energies of the neutron and the electron is constant. ... 

For the time being I dare not publish anything about this idea and address myself 
confidentially to you, dear radioactive ones, with the question how it would be with the 
experimental proof of such a neutron, if it were to have a penetrating power equal to or 
about ten times larger than a γ-ray. 

I admit that my way out may not seem very probable a priori since one would probably 
have seen the neutrons a long time ago if they exist. But only who dares wins and the 
seriousness of the situation concerning the continuous β-spectrum is illuminated by my 
honoured predecessor, Mr. Debye, who recently said to me in Brussels: “Oh, it is best not 
to think about this at all, as with new taxes”. One must therefore discuss seriously every 
road to salvation. Thus, dear radioactive ones, examine and judge. 

Unfortunately, I cannot appear personally in Tübingen since a ball which takes 
place in Zürich the night of sixth to seventh of December makes my presence here 
indispensable. ... 

Your most humble servant, 

W. Pauli’ 


In 1932, Chadwick’s discovery of the neutron, meaning the neutral partner of the proton, 
changed the picture. In the following year, Fermi suggested that Pauli’s ‘neutron’ might be 
better called a neutrino and that usage was established from then on. In 1934, 
Fermi published his theory of weak interactions and β decay (Fermi, 1934). In his famous 
paper, he treated the process by analogy with the process of the emission of radiation 
according to the rapidly developing theory of quantum electrodynamics. Neutrinos have 
a very small cross-section for interaction with matter and it was not until 1956 that they 
were detected experimentally by Frederick Reines and Clyde Cowan who used a fission 
reactor as a source of neutrinos (Reines and Cowan, 1956). The discovery was made only 
two and a half years before Pauli’s death. His response on receiving the news was sent in a 
congratulatory telegram: 


‘Thanks for message. Everything comes to him who knows how to wait. Pauli.’ 


18.7 Cosmic rays and the discovery of elementary particles 


From the 1930s until the early 1950s, the cosmic radiation provided a natural source of 
very high energy particles which were energetic enough to penetrate the nucleus. Throughout 
this period, it was the principal means by which new particles were discovered. As 
already recounted in Sect. 16.7.2, in 1930 Millikan and Anderson used an electromagnet 
10 times stronger than that used by Skobeltsyn to study the tracks of particles passing 
through a cloud chamber. Anderson observed curved tracks identical to those of electrons 
but corresponding to particles with positive electric charge (Anderson, 1932). The discovery 
of the positron was confirmed by Patrick Blackett and Giuseppe Occhialini in 1933 using 
an improved technique in which the cloud chamber was only triggered after it was certain 
that a cosmic ray had passed through (Blackett and Occhialini, 1933). They obtained many 
excellent photographs of the positive electrons, on many occasions showers containing 
equal numbers of positive and negative electrons created by cosmic ray interactions. 

There were more surprises in store, however. Anderson noted that there were often much 
more penetrating positive and negative particle tracks in the cloud chamber pictures. These 
particles displayed little evidence of interaction with the gas in the chamber. By 1936, 
Anderson and Seth Neddermeyer were sufficiently confident of their results to announce 
the discovery of particles with mass intermediate between that of the electron and the 
proton (Anderson and Neddermeyer, 1936). These mesotrons had mass between about 50 
and 400 times the mass of the electron. This discovery coincided rather conveniently with 
the theoretical prediction by Hideki Yukawa concerning the nature of the strong force 
which binds neutrons and protons together in the nucleus. According to Yukawa’s theory, 
the strong short-range force could be understood in terms of the exchange of particles about 
250 times as massive as the electron (Yukawa, 1935). In fact, the particles discovered by 
Anderson and Neddermeyer, nowadays known as muons, are not the particles which bind 
nuclei together. The identification was somewhat unsatisfactory because the mesotrons 
showed little interaction with nuclei in the chamber, whereas the exchange particle was 
expected to show a strong interaction with nuclei. 

The same procedures were used immediately after the Second World War by George 
Rochester and Clifford Butler who constructed a new cloud chamber to use with a large 
electromagnet obtained by Blackett before the War. In 1947 they reported the discovery 
of two cases of particle tracks in the form of ‘V’s with apparently no incoming particle 
(Rochester and Butler, 1947). They correctly suggested that the Vs resulted from the 
spontaneous decay of an unknown particle, the mass of which could be estimated from the 
decay products. Both had mass about half that of the proton. To obtain higher fluxes of 
cosmic radiation, the experiments were repeated at much higher altitudes. Two years later, 
the experiments were carried out by Blackett’s group working at the Pic du Midi Observatory 
in the Pyrenees and by Anderson and Cowan on White Mountain in California. Many more 
examples of Vs were found and this class of particle became known as strange particles. 
Both neutral and charged strange particles were discovered. Most of them had mass about 
half that of the proton and are what are now referred to as charged and neutral kaons ($K^+$, 
$K^-$, $K^0$). There were a few examples, however, of neutral particles with mass greater than 
the mass of the proton; these are now known as lambda particles ($\Lambda$). What puzzled the 
physicists was their long lifetimes, $10^{-8}$ and $10^{-10}$ s, which are many orders of magnitude 
greater than the time-scale associated with the strong interactions. 

Meanwhile another powerful tool for the study of particle collisions and interactions had 
been developed by Cecil Powell at Bristol University. Photographic plates had played a 
key role in the discovery of X-rays and radioactivity in the 1890s. Powell, in collaboration 
with the Ilford company, developed special ‘nuclear’ emulsions which were sufficiently 
sensitive to register the tracks of protons, electrons and all the other types of charged 
particle which had been discovered. Powell and his colleagues mastered the techniques of 
producing thick layers of emulsion by stacking layer upon layer of emulsion, resulting in a 
three-dimensional picture of the interactions taking place in the emulsion. Among the first 
discoveries using this high precision technique was that of the pion ($\pi$) in 1947, which was 
the particle predicted by Yukawa in 1935 (Lattes et al., 1947). 

By 1953, accelerator technology had developed to the point where energies comparable 
to those available in the cosmic rays could be produced in the laboratory with known 
energies and directed precisely onto the chosen target. After about 1953, the future of high 
energy physics lay in the accelerator laboratory rather than in the use of cosmic rays. 


18.8 Astrophysical applications 


The solution of the problem of energy generation in the Sun was one of the first fruits 
of the discovery of quantum mechanics. In his early papers Eddington had advocated the 
annihilation of matter as an inexhaustible source of energy for the stars but in 1920 he 
realised that, although there was no known mechanism by which nuclear energy could 
be released, at least energetically this provided an attractive means of powering the stars. 
In a remarkably prescient paragraph of his Presidential Address to the Mathematical and 
Physics Section of the British Association for the Advancement of Science at its Annual 
Meeting, held in Cardiff, he stated (Eddington, 1920): 


‘Certain physical investigations in the past year...make it probable to my mind that 
some portion of this sub-atomic energy is actually being set free in the stars. F. W. Aston’s 
experiments seem to leave no room for doubt that all the elements are constituted out 
of hydrogen atoms bound together with negative electrons. The nucleus of the helium 
atom, for example, consists of 4 hydrogen atoms bound with two electrons. But Aston has 
further shown conclusively that the mass of the helium atom is less than the sum of the 
masses of the 4 hydrogen atoms which enter into it; and in this at any rate the chemists 
agree with him. There is a loss of mass in the synthesis amounting to about 1 part in 120, 
the atomic weight of hydrogen being 1.008 and that of helium 4. ... Now mass cannot be 
annihilated, and the deficit can only represent the mass of the electrical energy set free 
in the transmutation. We can therefore at once calculate the quantity of energy liberated 
when helium is made out of hydrogen. If 5 per cent of the star’s mass consists initially of 
hydrogen atoms, which are gradually being combined to form more complex elements, 
the total heat liberated will more than suffice for our demands, and we need look no 
further for the source of a star’s energy.’ 


Eddington had the good fortune to be working at the Observatories at Cambridge University, 
only a twenty minute walk from the Cavendish Laboratory where Francis Aston was carrying 
out his precise measurements of atomic and isotopic masses. At that time, nuclear energy 
generation could be no more than a hypothesis, but Eddington had indeed hit upon the 
correct solution for the energy source of the Sun. The beauty of Eddington’s argument was 
that it did not depend upon the precise nature of the nucleus, but only upon the conservation 
of energy and the mass-energy relation $E = mc^2$. 
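
Eddington's estimate can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes the atomic weights quoted in the address (1.008 and 4) and his assumed 5 per cent hydrogen fraction, while the solar mass, the solar luminosity and the function names are modern values and labels assumed here, not taken from the text.

```python
# Rough check of Eddington's 1920 argument (illustrative sketch).
C = 2.998e8               # speed of light, m/s
M_SUN = 1.989e30          # solar mass, kg (modern value, assumed here)
L_SUN = 3.828e26          # solar luminosity, W (modern value, assumed here)
SECONDS_PER_YEAR = 3.156e7


def mass_deficit_fraction(a_hydrogen=1.008, a_helium=4.0):
    """Fractional mass lost when four hydrogen atoms form one helium atom."""
    return (4.0 * a_hydrogen - a_helium) / (4.0 * a_hydrogen)


def hydrogen_burning_lifetime_years(hydrogen_fraction=0.05):
    """Time for which the Sun could shine at its present luminosity
    if the stated fraction of its mass were converted to helium."""
    energy = hydrogen_fraction * M_SUN * mass_deficit_fraction() * C**2
    return energy / L_SUN / SECONDS_PER_YEAR


if __name__ == "__main__":
    fraction = mass_deficit_fraction()
    print(f"mass deficit: 1 part in {1.0 / fraction:.0f}")
    print(f"lifetime: {hydrogen_burning_lifetime_years():.1e} yr")
```

With these inputs the deficit is close to Eddington's 'about 1 part in 120', and the energy supply lasts several billion years at the present solar luminosity.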

The problem was that, even at the high temperatures of stellar interiors, the Coulomb 
repulsion between protons and nuclei is so great that, according to classical physics, protons 
could not penetrate the nucleus and so this energy source could not be tapped. The solution 
of this problem had to await the theory of quantum mechanical tunnelling (Gamow, 1928; 
Gurney and Condon, 1928). One year later, Robert Atkinson and Fritz Houtermans applied 
Gamow’s theory to the physics of nuclear reactions in the hot central regions of stars 
(Atkinson and Houtermans, 1929). By considering the process of barrier penetration by 
a Maxwellian distribution of protons, they established two key features of the process 
of nuclear energy generation in stars. First, the most effective energy sources involve 
interactions with nuclei of small electric charge since the Coulomb barriers are lower 
than for nuclei with large charges. Second, the particles which can penetrate the Coulomb 
barriers are those few particles in the high energy tail of the Maxwellian distribution. As 
a result, nuclear reactions can take place at temperatures which are considerably lower 
than might have been expected. These ideas also suggested why the luminosity of the stars 
should be a sensitive function of temperature. As the temperature increases, the rate of 
barrier penetration increases exponentially and so hotter, more massive stars should be 
more luminous than less massive stars. 

Atkinson’s objective was to account for the origin of the chemical elements by the 
successive addition of protons to nuclei. He argued that the process of forming helium by 
the combination of four protons was very unlikely and proposed instead that helium could 
be formed by the successive addition of protons to heavier nuclei which, when they became 
too massive for nuclear stability, would eject $\alpha$-particles and so create helium (Atkinson, 
1931a,b). This proposal was the precursor of the carbon-nitrogen-oxygen (CNO) cycle, 
which was discovered independently by Carl von Weizsäcker and Hans Bethe in 1938 
(Weizsäcker, 1937, 1938; Bethe, 1939). In this cycle, carbon acts as a catalyst for the 
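formation of helium through the successive addition of protons accompanied by two $\beta^+$ 
decays as follows: 

\[
\begin{aligned}
&{}^{12}\mathrm{C} + p \to {}^{13}\mathrm{N} + \gamma; \quad {}^{13}\mathrm{N} \to {}^{13}\mathrm{C} + e^+ + \nu_e; \quad {}^{13}\mathrm{C} + p \to {}^{14}\mathrm{N} + \gamma \\
&{}^{14}\mathrm{N} + p \to {}^{15}\mathrm{O} + \gamma; \quad {}^{15}\mathrm{O} \to {}^{15}\mathrm{N} + e^+ + \nu_e; \quad {}^{15}\mathrm{N} + p \to {}^{4}\mathrm{He} + {}^{12}\mathrm{C}.
\end{aligned}
\]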


In the meantime, it had become possible to make estimates of the reaction rates for the 
simplest nuclear reaction, the combination of pairs of protons to form deuterium nuclei 
which can then combine with other deuterons to form ${}^{3}\mathrm{He}$ and ${}^{4}\mathrm{He}$. The first calculations 
were carried out by Atkinson in 1936 (Atkinson, 1936) and were much refined in 1938 by 
Bethe and Critchfield who combined Fermi’s theory of weak interactions with Gamow’s 
theory of barrier penetration (Bethe and Critchfield, 1938). The principal series of reactions 
in the proton-proton (or p-p) chain are as follows: 


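\[
\begin{aligned}
&p + p \to {}^{2}\mathrm{H} + e^+ + \nu_e; \quad {}^{2}\mathrm{H} + p \to {}^{3}\mathrm{He} + \gamma \\
&{}^{3}\mathrm{He} + {}^{3}\mathrm{He} \to {}^{4}\mathrm{He} + 2p.
\end{aligned}
\]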


The crucial first reaction in the chain involves a weak interaction in which a positron and 
neutrino are released in what may be thought of as the transformation of one of the protons 
into a neutron. This reaction accounts for most of the energy release in the p-p chain but 
it has never been measured experimentally at the energies of interest for nucleosynthesis 
in the Sun. Bethe and Critchfield showed that this series of reactions could account for 
the luminosity of the Sun. In addition, they found that the rate of energy production $\varepsilon$ of 
the p-p chain depends upon the central temperature of the star as $\varepsilon \propto T^4$. In 1939, Bethe 
worked out the corresponding energy production rate for the CNO cycle and found a very 
much stronger dependence, $\varepsilon \propto T^{17}$ (Bethe, 1939). He concluded that the CNO cycle was 
dominant in massive stars while the p-p chain was the principal energy source for stars with 
mass less than roughly the mass of the Sun, $M_\odot$. 
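
The steep temperature dependences quoted above follow from the barrier-penetration argument. As a rough illustration, the sketch below uses the standard textbook Gamow-peak approximation for non-resonant reactions, in which the rate varies locally as $T^{\nu}$ with $\nu = (\tau - 2)/3$ and $\tau = 3E_0/kT$. This approximation, the choice of proton capture on ${}^{14}\mathrm{N}$ as the controlling step of the CNO cycle, the sample temperatures and the function names are all assumptions introduced here rather than details taken from Bethe's papers.

```python
import math

ALPHA = 1.0 / 137.036     # fine-structure constant
M_U_C2_KEV = 931494.0     # rest energy of one atomic mass unit, keV
K_B_KEV = 8.617e-8        # Boltzmann constant, keV per kelvin


def gamow_peak_kev(z1, z2, a1, a2, temperature):
    """Most effective energy E0 = (b kT / 2)^(2/3) for barrier penetration.

    The tunnelling probability is taken as exp(-b / sqrt(E)) with
    b = pi * alpha * Z1 * Z2 * sqrt(2 * mu * c^2); mass numbers stand in
    for the nuclear masses (an approximation).
    """
    reduced_mass = a1 * a2 / (a1 + a2)  # in atomic mass units
    b = math.pi * ALPHA * z1 * z2 * math.sqrt(2.0 * reduced_mass * M_U_C2_KEV)
    return (b * K_B_KEV * temperature / 2.0) ** (2.0 / 3.0)


def temperature_exponent(z1, z2, a1, a2, temperature):
    """Local exponent nu in rate ~ T**nu, using nu = (tau - 2) / 3."""
    kt = K_B_KEV * temperature
    tau = 3.0 * gamow_peak_kev(z1, z2, a1, a2, temperature) / kt
    return (tau - 2.0) / 3.0


if __name__ == "__main__":
    print(temperature_exponent(1, 1, 1, 1, 1.5e7))    # p + p
    print(temperature_exponent(1, 7, 1, 14, 2.5e7))   # p + 14N
```

Under these assumptions the exponent is close to 4 for the p-p reaction at about $1.5 \times 10^7$ K and between 16 and 17 for proton capture on ${}^{14}\mathrm{N}$ at about $2.5 \times 10^7$ K, in line with the dependences quoted in the text.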

The theory of white dwarfs was one of the first triumphs of the new quantum theory of 
statistical mechanics as applied to astrophysics. In 1926, Ralph Fowler used Fermi-Dirac 
statistics to derive the equation of state of a cold degenerate electron gas (Fowler, 1926) 
and found the important result 


\[
p = \frac{(3\pi^2)^{2/3}}{5}\,\frac{\hbar^2}{m_e}\left(\frac{\rho}{\mu_e m_u}\right)^{5/3}, \tag{18.8}
\]


where $\mu_e$ is the mean molecular weight of the material of the star per electron and $m_u$ is 
the unified atomic mass constant. The important aspect of this equation of state is that it 
is independent of temperature and so the structure of white dwarfs can be derived directly 
from the Lane-Emden equation of stellar structure.° Unlike main sequence stars, in which 
pressure support is provided by the thermal pressure of hot gas, the white dwarfs are 
supported by electron degeneracy pressure. The source of their luminosity is the internal 
thermal energy with which they were endowed on formation. According to Fowler’s picture, 
the white dwarfs simply radiate away their internal thermal energies and end up as inert 
cold stars with all the nuclei and electrons in their ground states. 

In 1929, Wilhelm Anderson showed that the degenerate electrons in the centres of white 
dwarfs with mass roughly that of the Sun become relativistic (Anderson, 1929). In the 
extreme relativistic limit, the equation of state of the degenerate electron gas becomes 


\[
p = \frac{(3\pi^2)^{1/3}\hbar c}{4}\left(\frac{\rho}{\mu_e m_u}\right)^{4/3}. \tag{18.9}
\]


Once again, the result is independent of temperature but the change in the dependence of 
pressure upon density from $p \propto \rho^{5/3}$ to $p \propto \rho^{4/3}$ has profound implications. Anderson 
and Edmund Stoner realised that the consequence was that there do not exist equilibrium 
configurations for degenerate stars with mass greater than about the mass of the Sun 
(Anderson, 1929; Stoner, 1929). The most famous analysis of this result was carried out by 
Subrahmanyan Chandrasekhar, who had begun working on this problem before he arrived 
to take up a fellowship at Trinity College, Cambridge in 1930. He found the crucial result 
that, in the extreme relativistic limit, there is an upper limit to the mass of stable white 
dwarfs, 





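\[
M_{\rm Ch} = \frac{(3\pi)^{1/2}}{2}\,2.01824\left(\frac{\hbar c}{G}\right)^{3/2}\frac{1}{(\mu_e m_u)^2} = \frac{5.836}{\mu_e^2}\,M_\odot. \tag{18.10}
\]
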
This mass is known as the Chandrasekhar mass (Chandrasekhar, 1931). The critical mass 
depends upon the chemical composition of the material of the star through the value 
of $\mu_e$, the mean molecular weight of the stellar material per electron. Other than that, 
the Chandrasekhar mass only depends upon fundamental constants. Since $\mu_e \approx 2$ for the 
material of compact stars, the Chandrasekhar mass is usually quoted as $M_{\rm Ch} = 1.46\,M_\odot$. 
The cause of the instability is that, in the extreme relativistic limit, both the internal 
thermal energy $U_{\rm th}$ and the gravitational potential energy $U_{\rm grav}$ of the star depend upon the 
radius in the same way, $U_{\rm th} = (1/2)U_{\rm grav} \propto R^{-1}$. Now, the gravitational potential energy 
is proportional to $M^2$ whereas the thermal energy is proportional to the mass of the star 
and so, for massive enough stars, the gravitational energy term dominates causing collapse, 
which cannot be stabilised by the pressure of the degenerate gas since the two energies 
always depend upon the radius in the same way. The inference is that there is nothing to 
prevent degenerate stars more massive than $M_{\rm Ch}$ from collapsing to very high densities 
indeed and possibly to a state of complete gravitational collapse. 
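
As a numerical illustration, equations (18.8) to (18.10) can be evaluated directly. The sketch below is not part of the original analysis: the constants are rounded modern values, the function names are hypothetical, and the Chandrasekhar mass is written in the standard form $M_{\rm Ch} = [(3\pi)^{1/2}/2] \times 2.01824 \times (\hbar c/G)^{3/2}/(\mu_e m_u)^2$, where 2.01824 is the Lane-Emden constant for a polytrope of index 3.

```python
import math

HBAR = 1.0546e-34    # reduced Planck constant, J s
C = 2.998e8          # speed of light, m/s
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_E = 9.109e-31      # electron mass, kg
M_U = 1.6605e-27     # unified atomic mass constant, kg
M_SUN = 1.989e30     # solar mass, kg


def pressure_nonrelativistic(rho, mu_e=2.0):
    """Cold degenerate electron pressure in the non-relativistic limit (18.8)."""
    n_e = rho / (mu_e * M_U)
    return (3 * math.pi**2) ** (2 / 3) / 5 * HBAR**2 / M_E * n_e ** (5 / 3)


def pressure_relativistic(rho, mu_e=2.0):
    """Cold degenerate electron pressure in the extreme relativistic limit (18.9)."""
    n_e = rho / (mu_e * M_U)
    return (3 * math.pi**2) ** (1 / 3) / 4 * HBAR * C * n_e ** (4 / 3)


def crossover_density(mu_e=2.0):
    """Density at which the two limiting expressions give the same pressure."""
    n_e = (5 / 4) ** 3 * (M_E * C / HBAR) ** 3 / (3 * math.pi**2)
    return mu_e * M_U * n_e


def chandrasekhar_mass_solar(mu_e=2.0):
    """Limiting white dwarf mass in solar units (18.10)."""
    mass = (math.sqrt(3 * math.pi) / 2 * 2.01824
            * (HBAR * C / G) ** 1.5 / (mu_e * M_U) ** 2)
    return mass / M_SUN


if __name__ == "__main__":
    print(f"crossover density: {crossover_density():.2e} kg/m^3")
    print(f"Chandrasekhar mass: {chandrasekhar_mass_solar():.2f} solar masses")
```

With $\mu_e = 2$ the limiting mass comes out close to the quoted 1.46 solar masses, and the two pressure laws cross at a density of a few times $10^9$ kg m$^{-3}$, which indicates roughly where the relativistic equation of state takes over.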

Quite independently in 1932, Lev Landau had come to the conclusion that gravitational 
collapse to a singularity should be taken seriously (Landau, 1932) and in 1939 Robert 
Oppenheimer and Hartland Snyder gave the first general relativistic analysis of what would 
be observed in the final stages of the gravitational collapse of a pressureless sphere 
(Oppenheimer and Snyder, 1939). In their paper, they described the key observed features of 
what are now termed black holes. 

Following Chadwick’s discovery of the neutron in 1932 (Chadwick, 1932), the first 
mention of the possibility of neutron stars appears as the famous ‘Additional Remark’ to 
a paper by Walter Baade and Fritz Zwicky of 1934 (Baade and Zwicky, 1934b). In that 
year, they published two papers on the energetics of what they termed ‘super-novae’. In 
their first paper, Baade and Zwicky proposed that the population of novae consists of two 
types, the ordinary novae, which are relatively common phenomena and which had been 
used by Lundmark as distance indicators for spiral nebulae, and the super-novae which 
are very rare but very energetic indeed (Baade and Zwicky, 1934a). In their second paper, 
they suggested that such events might be the sources of the cosmic rays, discovered by 
Victor Hess in 1912 (Hess, 1913). Both proposals are remarkably close to the truth. As an 
addendum to the second paper (Baade and Zwicky, 1934b), they wrote 


‘With all reserve we advance the view that a super-nova represents the transition of an 
ordinary star into a neutron star, consisting mainly of neutrons. Such a star may possess 
a very small radius and an extremely high density. As neutrons can be packed much 
more closely than ordinary nuclei and electrons, the “gravitational packing” energy in 
a cold neutron star may become very large, and under certain circumstances, may far 
exceed the ordinary nuclear packing fractions. A neutron star would therefore represent 
the most stable configuration of matter as such. The consequences of this hypothesis will 
be developed in another place, where also will be mentioned some observations that tend 
to support the idea of stellar bodies made up mainly of neutrons.’ 


It is best to allow Zwicky to describe how these ideas were received in a quotation from the 
extraordinary preface to his Catalogue of Selected Compact Galaxies and of Post-Eruptive 
Galaxies of 1968 (Zwicky, 1968). 


‘In the Los Angeles Times of January 19, 1934, there appeared an insert in one of the 
comic strips, entitled “Be Scientific with Ol’Doc Dabble” quoting me as having stated 
“Cosmic rays are caused by exploding stars which burn with a fire equal to 100 million 
suns and then shrivel from 5 million miles diameter to little spheres 14 miles thick”, Says 
Prof. Fritz Zwicky, Swiss Physicist. This, in all modesty, I claim to be one of the most 
concise triple predictions ever made in science. More than 30 years were to pass before 
this statement was proved to be true in every respect.’ 




In the meantime, Gamow showed in 1937 that a gas of neutrons could be compressed 
to a much higher density than a gas of nuclei and electrons and estimated the probable 
densities of such a star to be about $10^{17}$ kg m$^{-3}$ (Gamow, 1937, 1939). The issue of the 
maximum mass of neutron stars was discussed by Landau in 1938 (Landau, 1938) and in 
much greater detail by Oppenheimer, Robert Serber and George Volkoff (Oppenheimer and 
Serber, 1938; Oppenheimer and Volkoff, 1939). The physics is the same as in the case of the 
white dwarfs but now neutron degeneracy pressure holds up the star. Complications arise 
because it is necessary to take into account the details of the equation of state of neutron 
matter at nuclear densities and the effects of general relativity can no longer be neglected. 
They found an upper mass limit of about 0.7 $M_\odot$. This result is not so different from the 
best modern estimates which correspond to about 2-3 $M_\odot$. 
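
The density scale quoted by Gamow can be checked with a simple order-of-magnitude estimate. In the sketch below, the stellar mass of one solar mass and the nuclear radius constant of 1.2 fm are illustrative assumptions introduced here; only the radius of about 10 km is a figure used in this chapter for typical neutron stars.

```python
import math

M_SUN = 1.989e30   # solar mass, kg
M_U = 1.6605e-27   # unified atomic mass constant, kg


def mean_density(mass, radius):
    """Mean density of a uniform sphere in kg per cubic metre."""
    return mass / (4.0 / 3.0 * math.pi * radius**3)


if __name__ == "__main__":
    # One solar mass within a radius of 10 km (the mass is an assumed value).
    print(f"star:    {mean_density(M_SUN, 1.0e4):.1e} kg/m^3")
    # One nucleon per sphere of radius 1.2 fm (assumed nuclear radius constant).
    print(f"nucleus: {mean_density(M_U, 1.2e-15):.1e} kg/m^3")
```

Both estimates fall in the range $10^{17}$ to $10^{18}$ kg m$^{-3}$, so a neutron star is in effect matter at nuclear density.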

This work created some theoretical interest but little enthusiasm from the observers. The 
radii of typical neutron stars were expected to be about 10 km and so there was no prospect 
of detecting significant fluxes of thermal radiation from such tiny stars. The idea of Baade 
and Zwicky that a neutron star might be the remnant left behind after a supernova explosion 
was proved correct 33 years later with the discovery of pulsars by Antony Hewish, Jocelyn 
Bell(-Burnell) and their colleagues in 1967 (Hewish et al., 1968). 

Epilogue 





As expected when I started out on this project, this has proved to be a complex and, 
at times, difficult story. After all, what was involved was tearing up the foundations of 
classical physics, which had been extraordinarily successful in explaining the macroscopic 
world about us, and replacing it by something radically different and non-intuitive in terms 
of our everyday experience. But the effort involved has been more than repaid by the very 
much deeper appreciation I have gained of the extraordinary works of the pioneers of 
quantum mechanics, both the theorists and the experimenters. If the brilliant theoretical 
researches of Planck, Einstein, Bohr, Heisenberg, Born, Jordan, Schrödinger, Pauli, Dirac 
and many others form the central core of this story, it should be remembered that their 
researches were inspired by the equally brilliant achievements of experimental physics. 
Another huge bonus has been a deepened understanding of quantum mechanics itself — if 
only I had these insights more than 50 years ago when I first encountered the subject. 

There is a great deal more that could be said. I must reiterate that I have presented a 
somewhat streamlined version of the story in order to ensure that there is some continuous 
pathway, however tortuous, to the way in which the new understandings came about. For a 
full appreciation of the complexity of the story and the numerous blind alleys and diversions 
which took place, there is no substitute for in-depth absorption in Mehra and Rechenberg’s 
magisterial exposition of the history of quantum theory. Likewise, Jammer’s wonderful 
survey of the conceptual and philosophical development of the subject is indispensable. 
I make no claim even to begin to approach the depth and thoroughness of these expositions. 
Building on these authors’ achievements, my much more modest aim has been to 
demonstrate at an accessible level exactly what the great pioneers actually did and to enable 
readers to reconstruct for themselves the extraordinary efforts of the imagination involved 
in coming to these new understandings. In Chandrasekhar’s words, the appreciation of the 
subject cannot be achieved with ‘modest effort’. 

The story has an immediacy which corresponds exactly with my own experience as a 
research scientist. It reveals how experimental and theoretical physicists go about discovering 
new laws of nature and the way in which these then shape our world. I have concentrated 
upon the intellectual, theoretical and experimental side of the story, but there is an equally 
absorbing human story which is the subject of the numerous excellent biographies of the 
protagonists of this story. My admiration for all the founders of quantum mechanics has 
increased enormously through the detailed study of their original works. I personally find 
it a dramatic and awe-inspiring adventure which changed forever our understanding of the 
nature of the physical world we live in. 

Chapter 1 


1. Pippard’s review provides an illuminating account of the profession of the experimental and 
theoretical physicist in 1900 (Pippard, 1995). Equally important are the statistics in the essay by 
Heilbron (1977) concerning the central role of European physicists, particularly those in Germany, 
at the turn of the century. 

2. Many more details of these issues and problems are contained in the various case studies described 
in the second edition of my book Theoretical Concepts in Physics (TCP2) (Longair, 2003). The 
various topics will only be summarised here with references to the relevant sections of TCP2. 

3. I have enjoyed revisiting my old undergraduate chemistry textbook General and Inorganic 
Chemistry by P. J. Durrant (1952). I have used the formulation of the laws as expressed by Durrant. 

4. In the modern SI system, the carbon-12 atom is used as the standard for atomic weights. One mole 
is defined as the mass of a substance which contains the same number of chemical units (atoms or 
molecules) as exactly 12 grams of carbon-12. For example, since the molecular weight of oxygen 
is 31.9988, one mole of oxygen has mass 31.9988 gram. 

5. See TCP2, pp. 257-263. The outline of Maxwell’s derivation is as follows. The total number of 
molecules is $N$ and the $x$, $y$ and $z$ components of their velocities are $v_x$, $v_y$ and $v_z$. Maxwell 
supposed that the velocity distribution in the three orthogonal directions must be the same after a 
great number of collisions, that is, 

\[ N f(v_x)\,dv_x = N f(v_y)\,dv_y = N f(v_z)\,dv_z, \]

where $f$ is the same function. The three perpendicular components of the velocity are entirely 
independent and hence the number of molecules with velocities in the range $v_x$ to $v_x + dv_x$, $v_y$ to 
$v_y + dv_y$ and $v_z$ to $v_z + dv_z$ is 

\[ N f(v_x) f(v_y) f(v_z)\,dv_x\,dv_y\,dv_z. \]

But, the total velocity of any molecule $v$ is $v^2 = v_x^2 + v_y^2 + v_z^2$. Because large numbers of collisions 
have taken place, the probability distribution for the total velocity $v$, $\phi(v)$, must be isotropic and 
depend only on $v$, that is, 

\[ f(v_x) f(v_y) f(v_z) = \phi(v) = \phi\left[(v_x^2 + v_y^2 + v_z^2)^{1/2}\right], \tag{1} \]

where we have normalised the function $\phi(v)$ so that 

\[ \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \phi(v)\,dv_x\,dv_y\,dv_z = 1. \]

Equation (1) is a functional equation. By inspection, a suitable solution is 

\[ f(v_x) = C e^{A v_x^2}, \qquad \phi(v) = C^3 e^{A v^2}, \]

where $C$ and $A$ are constants. The distribution must converge as $v \to \infty$ and hence $A$ must be 
negative. After normalisation (1.1) is obtained. 

6. TCP2, Sect. 10.5. 

7. See TCP2, p. 266, where the origin of the term ‘Maxwell’s demon’ is explained. 

8. TCP2, Chap. 4. 

9. For the sake of clarity of exposition, I adopt SI units throughout this text. In Maxwell’s great 
papers of 1861 and 1865, the equations were written out in a somewhat lengthier form than 
(1.5)-(1.8) and involved seven equations. Maxwell’s equations were reorganised into their more 
familiar form by Heaviside, Helmholtz and Hertz in the subsequent years. 

10. TCP2, pp. 289-297. 

11. TCP2, pp. 400-403. 

12. The reasons for the neglect of Voigt’s analysis are discussed by Ernst and Hsu (2001) who point 
out that the general scale-invariant transformations are 

\[
t' = \kappa\gamma\left(t - \frac{Vx}{c^2}\right), \quad x' = \kappa\gamma(x - Vt), \quad y' = \kappa y, \quad z' = \kappa z, \quad \gamma = \left(1 - \frac{V^2}{c^2}\right)^{-1/2}. \tag{2}
\]

Voigt set the constant $\kappa = \gamma^{-1}$. Much later, long after the papers by Lorentz and Einstein, Lorentz, 
Wiechert, Minkowski, Born and Sommerfeld all acknowledged that Voigt had found the form of 
the Lorentz transforms in 1887. Not only is the equation scale invariant, it is also conformally 
invariant. 

13. This is demonstrated in my book High Energy Astrophysics, third edition, pp. 149-151 and 
Fig. 5.4. 

14. TCP2, pp. 404-405. 

15. TCP2, Sect. 11.2. 

16. Balmer’s paper of 1885 published in the Annalen der Physik und Chemie was a synthesis of two 
papers originally published in the Verhandlungen der Naturforschenden Gesellschaft in Basel 7, 
548-560 and 750-752. 

17. TCP2, Sect. A9.2. Maxwell’s relations provide partial differential relations between the four 
thermodynamic coordinates $p$, $V$, $S$ and $T$. 

18. This result is derived in TCP2, pp. 297-300. 


Chapter 2 


1. In his excellent book Inward Bound, Pais (1985) provides a detailed account of the development 
of experimental physics and its discoveries through the whole period covered by this book. 

2. As noted in Sect. 1.7.2, at the microscopic level, the system comes into thermal equilibrium, 
whatever the properties of the walls of the enclosure as a result of the principle of detailed 
balance. 

3. Quoted by M. J. Klein (1967) p. 3. 

4. The argument also works for perfectly absorbing walls since, according to Kirchhoff’s theorem, a 
perfect absorber is also a perfect emitter of radiation. Planck uses the case of perfectly reflecting 
walls so that the argument is purely electrodynamic. 

5. Notice that this intensity is the total power per unit area from $4\pi$ steradians. The usual definition 
of intensity is in terms of W m$^{-2}$ Hz$^{-1}$ sr$^{-1}$. Correspondingly, in (2.19), the relation between 
$I(\omega)$ and $u(\omega)$ is $I(\omega) = u(\omega)c$ rather than the usual relation $I(\omega) = u(\omega)c/4\pi$. 

6. See note 5 above about the relation between intensity and energy density. 

7. I have given a detailed analysis of Rayleigh’s paper in TCP2, Sect. 12.6 and only summarise the 
key results here. 

8. See TCP2, Sect. 12.4. Wien’s law (2.29) can be written 

\[ u(\nu) = \frac{8\pi\alpha}{c^3}\,\nu^3 e^{-\beta\nu/T}. \]

Combining this with the relation (2.26) between $u(\nu)$ and $E$, $u(\nu) = (8\pi\nu^2/c^3)E$, 

\[ E = \alpha\nu\,e^{-\beta\nu/T}. \]

The thermodynamic relation between the entropy $S$, internal energy $U$, the pressure $p$ and the 
volume $V$ is 

\[ T\,dS = dU + p\,dV. \]

Therefore, placing the oscillator in a fixed volume $V$, 

\[ \left(\frac{\partial S}{\partial U}\right)_V = \frac{1}{T}. \]

$U$ and $S$ are additive functions of state and hence the above relation refers to the properties of an 
individual oscillator as well as to an ensemble of them. Therefore, 

\[ \frac{1}{T} = \left(\frac{dS}{dE}\right)_V = -\frac{1}{\beta\nu}\ln\frac{E}{\alpha\nu}. \]

Integrating with respect to $E$, we obtain Planck’s definition of the entropy of the oscillator 

\[ S = -\frac{E}{\beta\nu}\ln\frac{E}{\alpha\nu e}. \tag{5} \]

9. See TCP2, Sect. 12.5. 

10. See TCP2, Sect. 13.3. 

11. See TCP2, Sect. 10.7. 

12. The simplest way of deriving this relation is to run ahead of our story and consider the radiation 
within the enclosure to consist of particles travelling in random directions at the speed of light. 
Then, according to classical kinetic theory, the number arriving at unit area of the enclosure per 
unit time is $\frac{1}{4}nc$. The total energy of particles arriving at unit area is therefore $\frac{1}{4}nc\varepsilon = \frac{1}{4}uc$. In 
thermodynamic equilibrium, this must also equal the radiated energy per unit surface area. 

13. See Klein (1967), p. 17. 


Chapter 3 


1. These papers are conveniently available in English translations in Einstein’s Miraculous Year, 
edited and introduced by John Stachel (1998). The translations are taken from The Collected 
Papers of Albert Einstein: Vol. 2. The Swiss Years: Writings, 1900-1909 (Stachel and Cassidy, 
1999), which can also be strongly recommended. 

2. See TCP2, Sect. 14.4. 


Chapter 4 


1. More details of the contributions of Zeeman and Lorentz are given by Kox (1997). 

2. See the rules for electric dipole radiation in Sect. 2.3.1. 

3. See Sect. 9.2.1 of High Energy Astrophysics (Longair, 2011). 

4. Thomson’s formula is similar to that derived for the interaction of cosmic ray electrons with 
thermal electrons. The formula is derived in Sect. 6.1 of High Energy Astrophysics (Longair, 
2011). 

5. A derivation of the formulae for Rutherford scattering is contained in the Appendix to Chapter 2 
of TCP2 (Longair, 2003). 


Chapter 5 


1. Different authors use different notation for the quantum numbers. I will endeavour 
to use a consistent notation throughout, but it will be different from those appearing in the primary 
literature. As we will find, the quantum numbers have slightly different meanings in the final 
version of quantum mechanics and so using $n_\phi$, $n_r$ and $n_\theta$ in the old quantum theory should not 
cause problems and should clarify the differences between the old and new theories. 

2. I have preserved the italic and bold fonts of the original text. 

3. See, for example, Goldstein (1950). 

4. There are four transformations between the $p_i$, $q_i$, $P_i$ and $Q_i$. Two of them are given in (5.75) and 
(5.76). The other two are: 

\[ q_i = -\frac{\partial S}{\partial p_i}, \quad P_i = -\frac{\partial S}{\partial Q_i} \quad \text{and} \quad K = H + \frac{\partial S}{\partial t}, \tag{6} \]

and 

\[ q_i = -\frac{\partial S}{\partial p_i}, \quad Q_i = \frac{\partial S}{\partial P_i} \quad \text{with} \quad K = H + \frac{\partial S}{\partial t}. \tag{7} \]

5. This follows the notation of Goldstein (1950) who uses $S$ in the time-dependent version of the 
Hamilton-Jacobi equations and $W$ in the time-independent version. $W$ is referred to as Hamilton's 
characteristic function. 

6. In the quantum mechanics of Heisenberg and Schrödinger, the principal quantum number is 
denoted by $n$, rather than $(n_r + n_\theta + n_\phi)$. As will become apparent, the significance of $n$ in 
defining the stationary states in the final theory of quantum mechanics is different from that in the 
Bohr-Sommerfeld picture. The other quantum numbers have corresponding equivalences which 
will be described in due course. 


Chapter 6 


1. The term ‘photon’ meaning quantum of light was introduced by Gilbert N. Lewis in 1926 (Lewis, 
1926). 

2. See, for example, High Energy Astrophysics, Sect. 9.2.2 (Longair, 2011). 

3. Note that, in Bohr’s usage, $\omega$ is the frequency of orbital motion, not the angular frequency. This 
usage is adopted in (6.25) and (6.26). 

4. Note that the quantum defects in (1.18) and (1.19)-(1.21) are written with a plus sign. The 
difference between these and (6.63) is explained by adopting different initial values for the 
principal quantum number $n$. 


Chapter 7 


1. Epstein was a pupil of Sommerfeld’s. Because he was a Polish citizen, he was interned as
an enemy alien in Munich, and he carried out these pioneering calculations during that period of
internment.

2. Schwarzschild’s paper was published on the day he died, 11 May 1916, from the illness pemphigus
contracted while he was on military service on the eastern front in Russia.

3. The details of the development of these ideas are described by Mehra and Rechenberg (Mehra and
Rechenberg, 1982a) (Vol. 1, Sect. IV.4) and summarised by Jammer (Jammer, 1989).


Chapter 8 


1. Details of Niels Bohr’s life and times are contained in the books Niels Bohr: A Centenary Volume
edited by A. P. French and P. S. Kennedy (1985) and Niels Bohr’s Times, in Physics, Philosophy
and Polity by A. Pais (1991).

2. See Mehra and Rechenberg (1982a), 1, 230.

3. The Wolfskehl Prize was awarded to Andrew Wiles on 28 June 1997 almost a century after
Wolfskehl’s death. Wiles completed his proof in 1995 but it took a further two years of expert
analysis to confirm that the theorem had at last been proved.

4. A priority dispute ensued between the French and Danish physicists about the discovery of hafnium.
This caused Bohr considerable distress (see, for example, the discussion by Kragh (1985)).

5. The history of X-ray spectroscopy is presented in the book Fifty Years of X-ray Diffraction (ed.
P. P. Ewald) (1962). This book is available on-line in pdf form from the International Union of
Crystallography at http://www.iucr.org/publ/50yearsofxraydiffraction.

6. Van der Waerden has made a careful study of why Pauli did not press ahead and propose the
spin and magnetic moment of the electron when he was so very close to their discovery (van der
Waerden, 1960).

7. More details of this episode and its context are given in Sect. 16.1.


Chapter 9 


1. See Mehra and Rechenberg (1982a), p. 511.

2. These results are derived in Chapter 9 of my book High Energy Astrophysics (Longair, 2011).

3. The differences were to find an explanation in quantum mechanics and are associated with the
symmetries of the wavefunctions for particles of different spins (see Chap. 16 and Sect. 18.1).

4. An excellent concise treatment of the derivation of the Bose-Einstein distribution is given by
Huang in his book Introduction to Statistical Physics (Huang, 2001).

5. In terms of quantum mechanics, the explanation for this distinction is neatly summarised by Huang
who remarks: “The classical way of counting in effect accepts all wave functions regardless of their
symmetry properties under the interchange of coordinates. The set of acceptable wave functions is
far greater than the union of the two quantum cases [the Fermi-Dirac and Bose-Einstein cases].”

6. In his paper, de Broglie uses the principle of least action in the form of Maupertuis’ principle
which can be applied in the case of a particle moving in a stationary potential (de Broglie, 1924b).
This is discussed in more detail in Sect. 14.5.1.

7. For a demonstration of this result, see Theoretical Concepts in Physics, Sect. 7.3 (Longair, 2003).


Chapter 10 


1. The term ‘quantum mechanics’ first appeared in the paper by Born (1924) which is discussed later
in this chapter. Jammer (1989) discusses the origin of this term and its significance.

2. See, for example, Sect. 5.3.2 of Theoretical Concepts in Physics (Longair, 2003).

3. This result is demonstrated in, for example, Sect. 6.11 of Theoretical Concepts in Physics (Longair,
2003).

4. Mehra and Rechenberg (1982a) provide a detailed account of the history of appointments to
the chairs of mathematics and physics at Göttingen (see pp. 262-313). Their discussion tracks
the various routes through the academic profession from doctoral student, to Habilitation, to
Privatdozent, to extraordinary professor and finally to the most prestigious position as ordinary
professor. Because of a prohibition on appointing professorships by internal promotion from
Privatdozent, the winning of a professorship normally meant moving from one University to
another. This accounts for the somewhat complex movements of physicists and mathematicians
as they made their way up through the university profession.

5. Mehra and Rechenberg, op. cit., Vol. 1, Part 1, p. 275.


Chapter 11 


1. Fermi’s problems with Heisenberg’s paper are described by van der Waerden (1967). Weinberg’s
concerns are expressed in his book Dreams of a Final Theory (Weinberg, 1992). He writes

‘If the reader is mystified at what Heisenberg was doing, he or she is not alone. I have tried several times
to read the paper that Heisenberg wrote on returning from Heligoland, and, although I think I understand
quantum mechanics, I have never understood Heisenberg’s motivation for the mathematical steps in
his paper. Theoretical physicists in their most successful work tend to play one of two roles: they are
either sages or magicians . . . It is usually not difficult to understand the papers of the sage-physicists,
but the papers of the magician-physicists are often incomprehensible. In that sense, Heisenberg’s 1925
paper was pure magic.’

2. The issues of the notation used by Heisenberg are discussed by van der Waerden (1967) who
shows how they are related to the notation used in the paper by Kramers and Heisenberg (1925)
and to his correspondence with Kronig and Pauli.

3. We obtain exact agreement if we set |xo| = |xo|/2 (see Mehra and Rechenberg (1982b), p. 301).
The justification for this step is given in Sect. 11.5, equation (11.70).

4. As pointed out by Aitchison et al. (2004), Heisenberg confusingly uses the terminology
$a(n, n - \tau)$ in both (11.40) and (11.70) without explaining why the factor of 4 disappears from
the latter expression.

5. The expression (11.104) is identical to the result found in the quantum theory of rotation, the $n$
being replaced by the angular momentum quantum number $j$. Therefore,

$$ W = \frac{h^2}{8\pi^2 I}\left[ j(j+1) + \tfrac{1}{2} \right]. \tag{8} $$


Chapter 12 


1. I enjoyed revisiting the elementary text book by A.C. Aitken Determinants and Matrices (1959)
from which I learned matrix algebra.

2. The term libration means periodic perturbations to the elliptical orbits of planets in celestial
mechanics and of electrons according to the old quantum theory.

3. I have given an elementary proof of this relation in Appendix A4 of my text Theoretical Concepts
in Physics (Longair, 2003).

4. In Sect. IV.5 of Volume 3 of their series, The Historical Development of Quantum Theory. The
Formulation of Matrix Mechanics and its Modifications 1925-1926, Mehra and Rechenberg
(1982c) provide a clear summary of the contents of Pauli’s important paper.


Chapter 13 


1. Poisson’s theorem is the statement that, for arbitrary functions $F_1$, $F_2$ and $F_3$,

$$ [[F_1, F_2], F_3] + [[F_2, F_3], F_1] + [[F_3, F_1], F_2] = 0, \tag{9} $$

where the square brackets represent Poisson brackets.

2. I use the convention that Poisson brackets are contained within square brackets. In the literature,
they are often enclosed in round brackets.

3. In Dirac’s paper, he uses $h$ to mean what we would now call $\hbar = h/2\pi$. Likewise, when he
uses the word frequency, he means what we would call angular frequency $\omega$. I have made these
translations in the usage employed throughout this book and also converted to SI units.

4. Pauli’s paper was received by Zeitschrift für Physik on 17 January 1926 and published on
27 March 1926. Dirac’s paper was received by Philosophical Transactions of the Royal Society
on 22 January 1926 and published on 1 March 1926.


Chapter 14 


1. These papers and others were published in English in the book Collected Papers on Wave Mechanics
(Schrödinger, 1928).

2. The origin of the Bose-Einstein condensation can be understood from the following arguments.
The Bose-Einstein distribution can be written

$$ n_i = \frac{g_i}{e^{\alpha + \beta E_i} - 1}, \tag{10} $$

where the constants $\alpha$ and $\beta$ are to be determined. On thermodynamic grounds, $\beta = 1/kT$
and $\alpha$ is determined by fixing the number of photons, atoms or molecules. For black-body radiation,
the number of photons is unconstrained and so $\alpha = 0$. The number of photons is matched to the
total energy in the Planck distribution: all the properties of the radiation are determined by the
temperature $T$ alone.

In the case of the atoms of a monatomic gas, as shown in Sect. 14.2, $g_i = 4\pi p^2\,\mathrm{d}p$ and
$E = p^2/2m$. Therefore, converting to a continuous energy distribution,

$$ N(E)\,\mathrm{d}E = \frac{4\pi V (2m^3 E)^{1/2}}{h^3}\,\frac{\mathrm{d}E}{(B e^{E/kT} - 1)}, \tag{11} $$

where $B = e^{\alpha}$. The constant $B$ cannot be less than 1 or else the number density of particles could
be negative. Therefore, $B \geq 1$, $\alpha \geq 0$. This expression can be rewritten as follows,

$$ N(E)\,\mathrm{d}E = \frac{V}{4\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2} \frac{E^{1/2}\,\mathrm{d}E}{(B e^{E/kT} - 1)}, \tag{12} $$

where $\hbar = h/2\pi$. Therefore, the total number of particles is given by

$$ N = \frac{V}{4\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2} \int_0^\infty \frac{E^{1/2}\,\mathrm{d}E}{(B e^{E/kT} - 1)}
     = \frac{V}{4\pi^2}\left(\frac{2mkT}{\hbar^2}\right)^{3/2} \int_0^\infty \frac{y^{1/2}\,\mathrm{d}y}{(B e^{y} - 1)}
     = V\left(\frac{mkT}{2\pi\hbar^2}\right)^{3/2} F(B), \tag{13} $$

where $y = E/kT$ and

$$ F(B) = \frac{2}{\sqrt{\pi}} \int_0^\infty \frac{y^{1/2}\,\mathrm{d}y}{B e^{y} - 1}. \tag{14} $$

A plot of $F(B)$ as a function of $B$ is shown in Fig. 1. For large values of $B$, the function $F(B)$
tends to $1/B$ and the particle distribution to the Boltzmann distribution. The figure also shows
that $F(B)$ tends to a finite limit as $B \to 1$. For $B = 1$, the integral for $F(B)$ becomes

$$ F(1) = \frac{2}{\sqrt{\pi}} \int_0^\infty \frac{y^{1/2}\,\mathrm{d}y}{e^{y} - 1} = \zeta(3/2) = 2.612, \tag{15} $$

where $\zeta$ is the zeta function. There is therefore a critical temperature $T_\mathrm{B}$ at which $F(B)$ reaches
this finite value given by

$$ T_\mathrm{B} = \frac{2\pi\hbar^2}{mk}\left(\frac{N}{2.612\,V}\right)^{2/3}. \tag{16} $$

Fig. 1. The function $F(B)$ as a function of $B$.

Einstein appreciated that this result led to a strange behaviour of the particle distribution at
temperatures below $T_\mathrm{B}$. According to (13), if $N$ is fixed, we would expect that, as the temperature
decreases, $F(B)$ would increase, but Bose-Einstein statistics show that this increase does not
continue below $T_\mathrm{B}$. If $F(B)$ reaches the limiting value of 2.612, the number of particles $N$ must
decrease at temperatures lower than $T_\mathrm{B}$.

Einstein realised that what had gone wrong was that the continuum approximation whereby we
replaced $g_i$ by $N(E)\,\mathrm{d}E$ does not take account of the fact that the quantised zero energy state can
contain a finite number of particles. The solution was that, at temperatures below $T_\mathrm{B}$, the particles
accumulate in the zero energy state and that at low enough temperatures, all the particles would
be condensed into that state. This was the discovery of the Bose-Einstein condensation. Although
it was not realised at the time, this was the first demonstration of a phase transition using the
techniques of statistical mechanics of identical particles.


3. In 1927, Willem Keesom and Mieczyslaw Wolfke discovered the strange variation of the specific
heat capacity of helium at the lambda point at 2.19 K which was identified as a phase transition
between the phases known as He I and He II (Keesom and Wolfke, 1928). In 1938, Fritz London
proposed interpreting this phase transition as the onset of the Bose-Einstein condensation (London,
1938). Taking the number density of liquid helium at the $\lambda$-point to be $2.18 \times 10^{28}$ m$^{-3}$,
$T_\mathrm{B} = 3.1$ K. There were however concerns about this identification since the expression (16) for
$T_\mathrm{B}$ was derived for an ideal gas, whereas the phase transition is observed in liquid helium.
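
The figure of about 3.1 K quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the ideal-gas expression $T_\mathrm{B} = (2\pi\hbar^2/mk)\,(n/\zeta(3/2))^{2/3}$ with $n = N/V$, uses CODATA-style values for the constants, and takes the helium-4 atomic mass as 4.002602 u. The function names `zeta` and `critical_temperature` are hypothetical helpers introduced for this example.

```python
import math

def zeta(s, n_terms=1000):
    """Riemann zeta function for s > 1.

    Direct sum of the first n_terms - 1 terms plus an Euler-Maclaurin
    estimate of the remaining tail.
    """
    head = sum(n ** -s for n in range(1, n_terms))
    big_n = float(n_terms)
    tail = big_n ** (1 - s) / (s - 1) + 0.5 * big_n ** -s + s * big_n ** (-s - 1) / 12
    return head + tail

HBAR = 1.054571817e-34                 # reduced Planck constant, J s
K_B = 1.380649e-23                     # Boltzmann constant, J/K
M_HE4 = 4.002602 * 1.66053906660e-27   # assumed mass of a helium-4 atom, kg

def critical_temperature(number_density, mass):
    """Ideal Bose gas: T_B = (2 pi hbar^2 / (m k)) * (n / zeta(3/2))**(2/3)."""
    return 2 * math.pi * HBAR**2 / (mass * K_B) * (number_density / zeta(1.5)) ** (2 / 3)

zeta_32 = zeta(1.5)
t_b = critical_temperature(2.18e28, M_HE4)   # number density quoted in the note
print(f"zeta(3/2) = {zeta_32:.4f}")
print(f"T_B = {t_b:.2f} K")
```

The result depends only weakly on the adopted density because of the 2/3 power, so modest changes in $n$ leave the estimate close to 3 K.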


4. Throughout this chapter, I have converted Schrödinger’s notation of quantum numbers to modern
usage. Thus, in Schrödinger’s notation, the angular quantum number $l$ is written as $n$.


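5. We need to evaluate

$$ \delta J = \delta \iiint \mathrm{d}x\,\mathrm{d}y\,\mathrm{d}z \left[ \left(\frac{\partial\psi}{\partial x}\right)^2 + \left(\frac{\partial\psi}{\partial y}\right)^2 + \left(\frac{\partial\psi}{\partial z}\right)^2 - \frac{2m}{K^2}\left(E + \frac{e^2}{4\pi\epsilon_0 r}\right)\psi^2 \right] = 0, \tag{17} $$

where the integral is taken over all space. In Sect. 5.4.2, we derived the procedures for finding the
stationary values of the Lagrangian $\mathcal{L}$ with respect to time. We can reformulate these procedures
for the present problem for a single variable $q_i$ by replacing $\mathcal{L}(\dot{q}_i, q_i, t)$ by $J(\psi', \psi, q_i)$, $\eta$ by
$\delta\psi$ and carrying out the variation with respect to spatial coordinate $q_i$. Making this replacement
in (5.57),

$$ S = \int_{q_1}^{q_2} J[\psi'(q_i), \psi(q_i), q_i]\,\mathrm{d}q_i + \int_{q_1}^{q_2} \left[ \frac{\partial J}{\partial\psi}\,\delta\psi + \frac{\partial J}{\partial\psi'}\,\delta\psi' \right] \mathrm{d}q_i, \tag{18} $$

where $\psi'(q_i)$ means $\mathrm{d}\psi/\mathrm{d}q_i$. As in Sect. 5.4.2, we can set the first integral equal to $S_0$ and integrate
the second term by parts. Then,

$$ S = S_0 + \left[ \frac{\partial J}{\partial\psi'}\,\delta\psi \right]_{q_1}^{q_2} - \int_{q_1}^{q_2} \left[ \frac{\mathrm{d}}{\mathrm{d}q_i}\left(\frac{\partial J}{\partial\psi'}\right) - \frac{\partial J}{\partial\psi} \right] \delta\psi\,\mathrm{d}q_i. \tag{19} $$

In the case discussed in Sect. 5.4.2, the first term in square brackets disappears because the
endpoints were assumed to be fixed. In the case of the hydrogen atom, this becomes a condition
on the convergence properties of the integral as $r \to \infty$. Let us continue with the integration in
the x-direction. The first term of (19) can be evaluated in the x-direction from the terms inside the
triple integral in (17) as follows:

$$ \iint \mathrm{d}y\,\mathrm{d}z \left[ \frac{\partial J}{\partial\psi'(x)}\,\delta\psi \right] = 2 \iint \mathrm{d}y\,\mathrm{d}z\,\frac{\partial\psi}{\partial x}\,\delta\psi = 2 \int \mathrm{d}A\,\frac{\partial\psi}{\partial n}\,\delta\psi, \tag{20} $$

where $\mathrm{d}n$ is the differential element of distance perpendicular to the element of area $\mathrm{d}A$. Extending
this calculation to three dimensions, this term becomes

$$ 2 \oint \mathrm{d}A\,\frac{\partial\psi}{\partial n}\,\delta\psi. \tag{21} $$

Extending the last integral of (19) to three dimensions,

$$ -\iiint \left[ \frac{\partial}{\partial x}\left(\frac{\partial J}{\partial\psi'(x)}\right) + \frac{\partial}{\partial y}\left(\frac{\partial J}{\partial\psi'(y)}\right) + \frac{\partial}{\partial z}\left(\frac{\partial J}{\partial\psi'(z)}\right) - \frac{\partial J}{\partial\psi} \right] \delta\psi\,\mathrm{d}x\,\mathrm{d}y\,\mathrm{d}z. \tag{22} $$

Therefore the stationary function $\psi(x, y, z)$ is found from the function which minimises the
expression

$$ \tfrac{1}{2}\delta J = \oint \mathrm{d}A\,\frac{\partial\psi}{\partial n}\,\delta\psi + \tfrac{1}{2} \iiint \mathrm{d}x\,\mathrm{d}y\,\mathrm{d}z \left[ \frac{\partial J}{\partial\psi} - \frac{\partial}{\partial x}\left(\frac{\partial J}{\partial\psi'(x)}\right) - \frac{\partial}{\partial y}\left(\frac{\partial J}{\partial\psi'(y)}\right) - \frac{\partial}{\partial z}\left(\frac{\partial J}{\partial\psi'(z)}\right) \right] \delta\psi. \tag{23} $$

The two terms must be separately zero, the first being the integral over a sphere as $r \to \infty$ and
the second must be true for all small changes $\delta\psi$ about the stationary solution. Therefore, these
conditions reduce to the requirements

$$ \frac{\partial J}{\partial\psi} - \left[ \frac{\partial}{\partial x}\left(\frac{\partial J}{\partial\psi'(x)}\right) + \frac{\partial}{\partial y}\left(\frac{\partial J}{\partial\psi'(y)}\right) + \frac{\partial}{\partial z}\left(\frac{\partial J}{\partial\psi'(z)}\right) \right] = 0 \tag{24} $$

and

$$ \oint \mathrm{d}A\,\frac{\partial\psi}{\partial n}\,\delta\psi = 0. \tag{25} $$

Inserting the expression for $J$ from (17) into (24), we immediately find

$$ \frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2} + \frac{2m}{K^2}\left(E + \frac{e^2}{4\pi\epsilon_0 r}\right)\psi = 0, \tag{26} $$

or

$$ \nabla^2\psi + \frac{2m}{K^2}\left(E + \frac{e^2}{4\pi\epsilon_0 r}\right)\psi = 0. \tag{27} $$

The expressions (25) and (27) are those derived by Schrödinger in his first paper on wave mechanics
(Schrödinger, 1926b).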


6. In Schrödinger’s paper, he uses the wave equation in the form (14.37) until he derived the energy
levels of the hydrogen atom and then infers $K = h/2\pi$. He certainly knew this relation had to be
true from the outset of the paper.

7. A simple demonstration of the origin of Maupertuis’ principle in one dimension can be obtained
from the Euler-Lagrange equation (5.59) which is derived by precisely the same procedure for
finding the stationary function between two fixed points. Writing $2T$ for $\mathcal{L}$ in (5.59) and considering
only motion in the x-direction,

$$ \frac{\partial T}{\partial x} - \frac{\mathrm{d}}{\mathrm{d}t}\left(\frac{\partial T}{\partial \dot{x}}\right) = 0. \tag{28} $$

In a conservative field of force, $T = \tfrac{1}{2} m \dot{x}^2 = E - U(x)$ and $E$ is the constant total energy of the
particle. Inserting these relations into (14.44), we find immediately

$$ m\frac{\mathrm{d}^2 x}{\mathrm{d}t^2} = -\frac{\mathrm{d}U(x)}{\mathrm{d}x} = f_x, \tag{29} $$

which is Newton’s law of motion and hence the proof of Maupertuis’ principle.


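8. Hamilton’s action function is defined to be $S = \int_{t_1}^{t_2} \mathcal{L}\,\mathrm{d}t$, where $\mathcal{L} = (T - U)$ is the Lagrangian, and
defines an action surface $S(x, y, z, t) = \text{constant}$. The properties of the action surface are directly
related to the dynamics of the particles of the system. These can be most simply appreciated from
the exposition by Landau and Lifshitz (1976). Using their notation, the expression for $\delta S$ (5.58)
can be rewritten

$$ \delta S = \left[ \frac{\partial\mathcal{L}}{\partial\dot{q}_i}\,\delta q_i \right]_{t_1}^{t_2} - \int_{t_1}^{t_2} \left[ \frac{\mathrm{d}}{\mathrm{d}t}\left(\frac{\partial\mathcal{L}}{\partial\dot{q}_i}\right) - \frac{\partial\mathcal{L}}{\partial q_i} \right] \delta q_i\,\mathrm{d}t. \tag{30} $$

Because of Lagrange’s equations of motion (5.59), the integral on the right-hand side of (30) is
zero and so

$$ \delta S = \left[ \frac{\partial\mathcal{L}}{\partial\dot{q}_i}\,\delta q_i \right]_{t_1}^{t_2}. \tag{31} $$

We now set $\delta q_i(t_1) = 0$ and let $\delta q_i(t_2) = \delta q_i$. Then, since $\partial\mathcal{L}/\partial\dot{q}_i = p_i$, we find

$$ \delta S = p_i\,\delta q_i, \tag{32} $$

or, summing over all the degrees of freedom,

$$ \delta S = \sum_i p_i\,\delta q_i. \tag{33} $$

It follows that

$$ p_i = \frac{\partial S}{\partial q_i}. \tag{34} $$

Thus, the components of the momentum are the gradients of the action surface $S$ in different
directions. In general we can write

$$ \boldsymbol{p} = \nabla S. \tag{35} $$

From the definition of the action given at the beginning of this endnote, the total time derivative
is

$$ \frac{\mathrm{d}S}{\mathrm{d}t} = \mathcal{L}. \tag{36} $$

We can now relate the total derivative of $S$ to the partial derivatives with respect to time and the
coordinates $q_i$ by the relations

$$ \frac{\mathrm{d}S}{\mathrm{d}t} = \frac{\partial S}{\partial t} + \sum_i \frac{\partial S}{\partial q_i}\,\dot{q}_i = \frac{\partial S}{\partial t} + \sum_i p_i \dot{q}_i = \mathcal{L}, \tag{37} $$

from (34). Hence,

$$ \frac{\partial S}{\partial t} = \mathcal{L} - \sum_i p_i \dot{q}_i, \tag{38} $$

or,

$$ \frac{\partial S}{\partial t} = -H. \tag{39} $$

Rewriting this expression in the notation used in (14.48) for a single particle, we find

$$ \frac{\partial S}{\partial t} = -E. \tag{40} $$

Equations (35) and (40) are those used by Hamilton in comparing particle dynamics and the paths
of light rays.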


9. The simple way of deriving this result, with a great deal of hindsight, is to recognise that the
quantisation of angular momentum results in the expression $J^2 = l(l + 1)\hbar^2$. Using the fact that
the expression for the energy of a stationary state is the same in quantum mechanics as in classical
mechanics,

$$ E = \frac{J^2}{2I} = \frac{l(l+1)\hbar^2}{2I} = \frac{l(l+1)h^2}{8\pi^2 I}. \tag{41} $$

10. In the mathematical description of the propagation of a wave-packet in a dispersive medium, the
pulse shape $A(x)$ is modulated by a ‘carrier wave’ with wavevector $K$. The profile of the wave-packet
is therefore $f(x, t = 0) = A(x)\,e^{iKx}$. Taking the Fourier transform of the wave-packet, we
can write

$$ A(x)\,e^{iKx} = \left( \int_{-\infty}^{\infty} B(q)\,e^{iqx}\,\mathrm{d}q \right) e^{iKx}. \tag{42} $$

If the dispersion relation for the waves is $\omega(k)$, it is a straightforward calculation to show that, if
$q \ll K$, at a later time $t > 0$, the wave-packet has the form

$$ f(x, t) = \underbrace{e^{i[Kx - \omega(K)t]}}_{\text{(i)}} \int_{-\infty}^{\infty} B(q)\, \underbrace{e^{iq(x - \omega' t)}}_{\text{(ii)}}\, \underbrace{e^{-iq^2\omega'' t/2}}_{\text{(iii)}}\,\mathrm{d}q, \tag{43} $$

where $\omega' = \mathrm{d}\omega/\mathrm{d}k$ is the group velocity and $\omega'' = \mathrm{d}^2\omega/\mathrm{d}k^2$ is the second derivative of the angular
frequency with respect to wavenumber $k$, both evaluated at $k = K$.

Let us consider the representation of the motion of a uniformly moving particle by a wave-packet.
The envelope of the wave-packet is modulated by the carrier frequency which moves at
the phase velocity $v_\mathrm{ph} = \omega/k$ in the positive x-direction, term (i) in (43). Term (ii) in (43) shows
that all the Fourier components of the waveform move at the group velocity, $v_\mathrm{gr} = \mathrm{d}\omega/\mathrm{d}k$, in the
positive x-direction. For a free particle, it is shown in endnote 11 that the dispersion relation
is $\omega = \hbar k^2/2m$ and so $v_\mathrm{gr} = \mathrm{d}\omega/\mathrm{d}k = \hbar k/m = p/m = v$, where $v$ is the constant speed of the
particle. Thus, the wave-packet propagates with the speed of the particle it is to represent.

The term (iii) represents the dispersion of the wave because generally, in classical physics, the
group velocity is slightly different at the leading and trailing edges of the wave-packet. For a
particle moving at constant speed according to de Broglie’s hypothesis, we have shown, however,
that $\mathrm{d}\omega/\mathrm{d}k = v = \text{constant}$ and so $\mathrm{d}^2\omega/\mathrm{d}k^2 = 0$. In other words, there is no dispersion of the
wave-packet which represents a particle moving at constant speed in the x-direction.
11. De Broglie postulated that the wavevector $k$ associated with the momentum of a particle should
be $p = \hbar k$ and that the relation between the energy of the particle $E$ and angular frequency $\omega$
should be similar to that for photons, $E = \hbar\omega$. For a free particle, the energy is the kinetic energy
$E = \tfrac{1}{2} m v^2 = p^2/2m$. Therefore,

$$ \hbar\omega = \frac{p^2}{2m} = \frac{\hbar^2 k^2}{2m} \quad \text{or} \quad \omega = \frac{\hbar k^2}{2m}. \tag{44} $$

This dispersion relationship must be the solution of our quantum mechanical wave equation for
a free particle. The expression $\omega = \hbar k^2/2m$ is significantly different from the dispersion relation
for light waves, $\omega = ck$. However, the group velocity, $v_\mathrm{gr} = \mathrm{d}\omega/\mathrm{d}k$, is

$$ v_\mathrm{gr} = \frac{\mathrm{d}\omega}{\mathrm{d}k} = \frac{\hbar k}{m} = \frac{p}{m} = v, $$

the speed of the particle, as discussed in Sect. 14.5.1.

Suppose the wavefunction for a free particle is a sine wave, $\psi(x, t) = A \sin(kx - \omega t)$. Taking
the second derivative of $\psi$ with respect to $x$, the expression for $\sin(kx - \omega t)$ is multiplied by $-k^2$:

$$ \frac{\partial^2\psi(x, t)}{\partial x^2} = -k^2 A \sin(kx - \omega t). \tag{45} $$

To obtain a factor containing $\omega$, we take the first derivative with respect to $t$:

$$ \frac{\partial\psi(x, t)}{\partial t} = -\omega A \cos(kx - \omega t). \tag{46} $$

We cannot find an equation involving a second derivative with respect to $x$ and a first derivative
with respect to $t$ if the solution is to be simply a sine wave of the form $\sin(kx - \omega t)$. If instead
the solution for a free particle is to be a complex wave of the form $\psi(x, t) = A \exp[i(kx - \omega t)]$,
then, taking partial derivatives with respect to $x$ twice and once with respect to $t$, we find

$$ \frac{\partial^2\psi}{\partial x^2} = -k^2 A\,e^{i(kx - \omega t)} = -\frac{p^2}{\hbar^2}\,\psi, \qquad \frac{\partial\psi}{\partial t} = -i\omega A\,e^{i(kx - \omega t)} = -\frac{iE}{\hbar}\,\psi. \tag{47} $$

We can also write $E\psi = (p^2/2m)\psi$ and so

$$ -\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2} = i\hbar\frac{\partial\psi}{\partial t}, \qquad E\psi = i\hbar\frac{\partial\psi}{\partial t}. \tag{48} $$

These relations are of exactly the same form as (14.102) for the time dependence of the wavefunction.
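
As a rough numerical illustration of the argument in this endnote, the sketch below checks by finite differences that a complex plane wave with $\omega = \hbar k^2/2m$ satisfies $-(\hbar^2/2m)\,\partial^2\psi/\partial x^2 = i\hbar\,\partial\psi/\partial t$, and that the group velocity equals $\hbar k/m$. Natural units $\hbar = m = 1$ are an assumption made purely for convenience, and the function names are hypothetical helpers, not anything from the text.

```python
import cmath

HBAR = 1.0   # natural units, assumed for illustration
MASS = 1.0

def omega(k):
    """Free-particle dispersion relation w = hbar k^2 / 2m."""
    return HBAR * k**2 / (2 * MASS)

def psi(x, t, k, amplitude=1.0):
    """Complex plane wave A exp[i(kx - wt)]."""
    return amplitude * cmath.exp(1j * (k * x - omega(k) * t))

def schrodinger_residual(x, t, k, h=1e-4):
    """|(-hbar^2/2m) d2psi/dx2 - i hbar dpsi/dt| using central differences."""
    d2_dx2 = (psi(x + h, t, k) - 2 * psi(x, t, k) + psi(x - h, t, k)) / h**2
    d_dt = (psi(x, t + h, k) - psi(x, t - h, k)) / (2 * h)
    return abs(-HBAR**2 / (2 * MASS) * d2_dx2 - 1j * HBAR * d_dt)

def group_velocity(k, h=1e-6):
    """Central-difference estimate of dw/dk."""
    return (omega(k + h) - omega(k - h)) / (2 * h)

residual = schrodinger_residual(x=0.3, t=0.7, k=2.0)
v_group = group_velocity(2.0)
print(f"residual = {residual:.2e}, group velocity = {v_group:.6f}")
```

Replacing the complex exponential in `psi` by a real sine wave makes the residual of order unity, which is the point made in the text about why a complex wavefunction is needed.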


Chapter 15 


1. I use the term Heisenberg approach to refer to matrix mechanics as developed by Born, Heisenberg
and Jordan in their seminal papers of 1925 and 1926 (Born and Jordan, 1925b; Born et al.,
1926).

2. Note that Schrödinger does not use a consistent set of suffices for the matrix elements used in
this development. I have largely maintained the usage in Schrödinger’s paper.

3. This remark was made as part of a recorded interview with Born on 17 October 1962. The
interview is preserved in the Archive for the History of Quantum Physics (see Jammer (1989)).

4. This letter was translated by van der Waerden and appears with a commentary and analysis in
his paper From matrix mechanics and wave mechanics to unified quantum mechanics (van der
Waerden, 1973).

5. The method had been previously known to mathematical physicists, prominent among them
being Harold Jeffreys. For this reason, a ‘J’ is often added to WKB and the order of the initials
may vary from country to country.
Chapter 16


1. See Mehra and Rechenberg (1987), p. 736.

2. I have given this simple physical derivation in Sect. 13.2.1 of High Energy Astrophysics (Longair,
2011).

3. Note that in his papers Dirac takes Planck’s constant to be what is now designated $\hbar$. I have
translated his notation into standard contemporary usage.

4. Dirac’s remark was made as part of a recorded interview carried out on 7 May 1963. The interview
is preserved in the Archive for the History of Quantum Physics. See also, Mehra and Rechenberg
(1987), p. 767.

5. In his paper, Pauli (1927b) writes the non-relativistic Schrödinger equation as a pair of equations,
one for each spin state

$$ H\left( \frac{h}{2\pi i}\frac{\partial}{\partial q_k}, q_k, \sigma_x, \sigma_y, \sigma_z \right) \psi_\alpha = E\,\psi_\alpha, \qquad
   H\left( \frac{h}{2\pi i}\frac{\partial}{\partial q_k}, q_k, \sigma_x, \sigma_y, \sigma_z \right) \psi_\beta = E\,\psi_\beta, \tag{49} $$

the $\psi_\alpha$ and $\psi_\beta$ corresponding to our $\psi_+$ and $\psi_-$.


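6. Specifically, on page 69 of Baker’s Principles of Geometry (1922), a set of four 2 × 2 matrices is
given as an example which could form the basis of a self-consistent non-commutative geometry.
To quote Baker, “Using 0, 1, i, . . . let us consider then the four symbols

$$ U = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad I = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad J = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}, \quad K = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}; \tag{50} $$

it is then easy to compute that

$$ JK = -KJ = I, \quad KI = -IK = J, \quad IJ = -JI = K, \tag{51} $$

$$ I^2 = J^2 = K^2 = -U.\text{”} \tag{52} $$

Evidently, Dirac had to find corresponding matrices which could represent the three orthogonal
spin orientations, $\sigma_x$, $\sigma_y$ and $\sigma_z$.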

7. I am using the conventions used by Rindler (2001) in which the Lorentz transformations are
written:

$$ \begin{aligned}
ct' &= \gamma(ct - Vx/c), \\
x' &= \gamma(x - Vt), \\
y' &= y, \\
z' &= z,
\end{aligned} \tag{53} $$

where $\gamma = (1 - V^2/c^2)^{-1/2}$ is the Lorentz factor. The prototype displacement four-vector is
$\mathbf{R} = [ct, x, y, z]$ and the norm of the displacement four-vector is $\mathrm{norm}(\mathbf{R}) = c^2t^2 - x^2 - y^2 - z^2$.
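
8. The forms of the 4 × 4 $\alpha$ matrices can be readily derived from Dirac’s paper (Dirac, 1928a). They
are:

$$ \alpha_x = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}, \qquad
   \alpha_y = \begin{pmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \\ 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \end{pmatrix}, $$

$$ \alpha_z = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix}, \qquad
   \beta = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. $$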
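The spin matrices introduced in (16.91) immediately follow from the above definitions,

$$ \sigma_x = i\alpha_z\alpha_y = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad
   \sigma_y = i\alpha_x\alpha_z = \begin{pmatrix} 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \end{pmatrix}, \quad
   \sigma_z = i\alpha_y\alpha_x = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. $$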
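
The defining algebra of these matrices can be confirmed numerically. The following sketch assumes the standard Dirac representation for $\alpha_x$, $\alpha_y$, $\alpha_z$ and the fourth matrix (called `BETA` here, a labelling assumption), and checks that each squares to the unit matrix, that distinct matrices anticommute, and that $i\alpha_z\alpha_y$ reproduces the block-diagonal form of $\sigma_x$. The helper functions are hypothetical and written only for this example.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested tuples."""
    n = len(a)
    return tuple(
        tuple(sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n))
        for r in range(n)
    )

def add(a, b):
    """Element-wise sum of two matrices."""
    return tuple(tuple(p + q for p, q in zip(ra, rb)) for ra, rb in zip(a, b))

def scale(factor, a):
    """Multiply every entry of a matrix by a scalar."""
    return tuple(tuple(factor * entry for entry in row) for row in a)

# Assumed standard Dirac representation.
ALPHA_X = ((0, 0, 0, 1), (0, 0, 1, 0), (0, 1, 0, 0), (1, 0, 0, 0))
ALPHA_Y = ((0, 0, 0, -1j), (0, 0, 1j, 0), (0, -1j, 0, 0), (1j, 0, 0, 0))
ALPHA_Z = ((0, 0, 1, 0), (0, 0, 0, -1), (1, 0, 0, 0), (0, -1, 0, 0))
BETA = ((1, 0, 0, 0), (0, 1, 0, 0), (0, 0, -1, 0), (0, 0, 0, -1))

IDENTITY = tuple(tuple(1 if r == c else 0 for c in range(4)) for r in range(4))
ZERO = tuple(tuple(0 for _ in range(4)) for _ in range(4))
MATRICES = (ALPHA_X, ALPHA_Y, ALPHA_Z, BETA)

squares_ok = all(matmul(m, m) == IDENTITY for m in MATRICES)
anticommute_ok = all(
    add(matmul(a, b), matmul(b, a)) == ZERO
    for i, a in enumerate(MATRICES)
    for b in MATRICES[i + 1:]
)
sigma_x = scale(1j, matmul(ALPHA_Z, ALPHA_Y))
print(squares_ok, anticommute_ok, sigma_x)
```

The same `scale` and `matmul` calls with the index pairs cycled give the other two spin matrices.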


9. More details of the history of cosmic rays can be found in my book The Cosmic Century, Sect. 7.2
(Longair, 2006).


Chapter 17 


1. This topic is dealt with in much more detail in Sect. 15.3 of my book Theoretical Concepts in
Physics (Longair, 2003).

2. See the discussion by Jammer (1989), pp. 328-329.
3. The essence of Heisenberg’s analysis can be appreciated using Schrödinger’s representation of a
particle by a wave-packet and then using de Broglie’s relations between momentum and wavelength.
A localised wave-packet can be represented by a superposition of an infinite number of waves.
Following Jordan and Heisenberg, let us choose a continuous distribution of waves which has a
Gaussian distribution of amplitudes about some central wavenumber $k'$,

$$ A(k) = \exp\left[ -\frac{(k - k')^2}{2k_1^2} \right], \tag{54} $$

where $k_1$ is the standard deviation of the amplitudes of the waves of different values of $k$ about the mean
value $k'$. When we dealt with discrete values of $k_n$, we described the function as a summation of
sine or cosine waves. Let us consider the cosine waves for convenience,

$$ \psi(x) = \sum_n A_n(k_n) \cos k_n x. \tag{55} $$

Converting this into a continuous distribution of wavenumbers, $\psi$ becomes

$$ \psi(x) = \int_{-\infty}^{+\infty} A(k) \cos kx\,\mathrm{d}k. \tag{56} $$

We carry out this integral for the above Gaussian distribution $A(k)$. The integral becomes

$$ \int_{-\infty}^{+\infty} \exp\left[ -\frac{(k - k')^2}{2k_1^2} \right] \cos kx\,\mathrm{d}k. \tag{57} $$

We write $\cos kx = \Re(e^{ikx})$ where the symbol $\Re$ means ‘take the real part of’. Therefore, the
integral becomes

$$ \Re \int_{-\infty}^{+\infty} \exp\left[ -\frac{(k - k')^2}{2k_1^2} + ikx \right] \mathrm{d}k. \tag{58} $$

Note that this expression has exactly the same form as (17.49) and explains the origin of the
complex terms in Jordan’s expression for the probability amplitude. Now, let us deal with the
expression inside the exponential. To integrate the expression, we need to complete the square and
so we write

$$ \begin{aligned}
-\frac{(k - k')^2}{2k_1^2} + ikx
 &= -\frac{k^2 - 2kk' + k'^2 - 2ikxk_1^2}{2k_1^2} \\
 &= -\frac{k^2 - 2k[k' + ixk_1^2] + [k' + ixk_1^2]^2}{2k_1^2} + \frac{2ik'xk_1^2 - x^2k_1^4}{2k_1^2} \\
 &= -\frac{[k - k' - ixk_1^2]^2}{2k_1^2} + ik'x - \frac{x^2k_1^2}{2}.
\end{aligned} \tag{59} $$

Therefore, we have to integrate

$$ \int_{-\infty}^{+\infty} \exp\left\{ -\frac{[k - k' - ixk_1^2]^2}{2k_1^2} \right\} \times \exp(ik'x) \exp\left( -\frac{x^2k_1^2}{2} \right) \mathrm{d}k. \tag{60} $$

Notice that the terms after the ‘times’ sign do not contain the variable $k$ and so these terms can be
taken outside the integral. Now, we change variables to

$$ y = \frac{k - k' - ixk_1^2}{\sqrt{2}\,k_1}, \qquad \mathrm{d}k = \sqrt{2}\,k_1\,\mathrm{d}y. \tag{61} $$

Thus, the integral reduces to

$$ \exp(ik'x) \exp\left( -\frac{x^2k_1^2}{2} \right) \sqrt{2}\,k_1 \int_{-\infty}^{+\infty} e^{-y^2}\,\mathrm{d}y. \tag{62} $$

But, the integral

$$ \int_{-\infty}^{+\infty} e^{-y^2}\,\mathrm{d}y = \sqrt{\pi}. $$

Therefore, taking the real part of the expression in front of the definite integral, we find that the
answer is

$$ \psi(x) = \Re\left[ \exp(ik'x) \exp\left( -\frac{x^2k_1^2}{2} \right) \sqrt{2\pi}\,k_1 \right] \tag{63} $$

$$ \phantom{\psi(x)} = A \cos k'x\, \exp\left( -\frac{x^2}{2x_1^2} \right), \tag{64} $$

where $A = \sqrt{2\pi}\,k_1$ and $x_1 = k_1^{-1}$. Thus, the function $\psi(x)$ has a Gaussian envelope with standard
deviation $x_1$ which is related to the spread in wavenumbers $k_1$ by the relation $x_1 = k_1^{-1}$. According to
de Broglie’s relation $k = 2\pi/\lambda = 2\pi p/h$, and so

$$ x_1 p_1 = \frac{h}{2\pi}, \tag{65} $$

exactly Heisenberg’s relation (17.51).
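
The closed form obtained by completing the square can be compared with a direct numerical evaluation of the integral. The sketch below is a minimal check under stated assumptions: it uses a plain trapezoidal rule over a range of twelve standard deviations either side of $k'$, and the values $k' = 3$ and $k_1 = 0.5$ are arbitrary choices made for illustration. The function names are hypothetical.

```python
import math

def packet_numeric(x, k_mean, k1, half_width=12.0, steps=4000):
    """Trapezoidal estimate of the integral of exp[-(k - k')^2 / (2 k1^2)] cos(kx) dk."""
    k_low = k_mean - half_width * k1
    dk = 2 * half_width * k1 / steps
    total = 0.0
    for n in range(steps + 1):
        k = k_low + n * dk
        weight = 0.5 if n in (0, steps) else 1.0   # end points count half
        total += weight * math.exp(-((k - k_mean) ** 2) / (2 * k1**2)) * math.cos(k * x)
    return total * dk

def packet_closed_form(x, k_mean, k1):
    """sqrt(2 pi) k1 cos(k' x) exp(-x^2 k1^2 / 2), a Gaussian envelope of width 1/k1."""
    return math.sqrt(2 * math.pi) * k1 * math.cos(k_mean * x) * math.exp(-((x * k1) ** 2) / 2)

K_MEAN, K1 = 3.0, 0.5   # illustrative values, not taken from the text
max_error = max(
    abs(packet_numeric(x, K_MEAN, K1) - packet_closed_form(x, K_MEAN, K1))
    for x in (0.0, 0.7, 2.0, 4.0)
)
print(f"largest discrepancy = {max_error:.2e}")
```

Halving `K1` doubles the width of the envelope in $x$, which is the reciprocal relation $x_1 = k_1^{-1}$ behind the uncertainty product.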

4. These are the rules which we teach our students in their first serious introduction to quantum 
mechanics. I am grateful to my colleagues Prof. Michael Payne and Dr. Howard Hughes for 
permission to reproduce these statements which were at the heart of their lecture courses on 
quantum mechanics. 


Chapter 18 


1. I give a number of simple examples of symmetry and conservation laws in Sect. 7.5 of my book 
Theoretical Concepts in Physics (Longair, 2003) starting from the Euler-Lagrange equations. 
2. An excellent summary of the early history of quantum tunnelling is given by Merzbacher (2002). 




3. A delightful account of the history of the Cockcroft and Walton experiment is contained in the
book by Brian Cathcart The Fly in the Cathedral: How a Small Group of Cambridge Scientists
Won the Race to Split the Atom (Cathcart, 2005).

4. This translation is taken from Pais’s book Inward Bound in which many more details of this
complex story are recounted (Pais, 1985).

5. The section is an abbreviated version of sections of Chaps. 3 and 4 of my book The Cosmic Century
(2006).

6. This equation is derived in Sect. 13.2.2 in my book High Energy Astrophysics (Longair, 2011)
where it is used to derive the expression (18.10).